Showing posts with label University of Edinburgh. Show all posts

Friday, 15 January 2016

PhD studentships on the ERC project Skye: A programming language bridging theory and practice for scientific data curation, University of Edinburgh

Project description

Science is increasingly data-driven. Scientific research funders now routinely mandate open publication of publicly-funded research data. Safely reusing such data currently requires labour-intensive curation. Provenance, which records the history and derivation of the data, is critical to reaping the benefits and avoiding the pitfalls of data sharing. There are hundreds of curated scientific databases in biomedicine that need fine-grained provenance; one important example is the IUPHAR/BPS Guide to Pharmacology database (GtoPdb), a pharmacological database developed in Edinburgh.
Currently there are no reusable methodologies or practical tools that support provenance for curated databases, forcing each project to start from scratch. Research on provenance for scientific databases is still at an early stage, and prototypes have so far proven challenging to deploy or evaluate in the field. Also, most techniques to date focus on provenance within a single database, but this is only part of the problem: real solutions will have to integrate database provenance with the multiple tiers of web applications, and no-one has begun to address this challenge.
The Skye project will build support for curation into the programming language itself, building on recent research on the Links Web programming language, including advances in language-integrated query, and on provenance and data curation. Links is a strongly-typed language that provides state-of-the-art support for language-integrated query and Web programming. This project will build on Links and other recent language designs for heterogeneous meta-programming to develop a new language, called Skye, that can express modular, reusable curation and provenance techniques. To keep focus on the real needs of scientific databases, Skye will be evaluated in the context of GtoPdb and other scientific database projects. Bridging the gap between curation research and the practices of scientific database curators will catalyse a virtuous cycle that will increase the pace of breakthrough results from data-driven science.
Skye will draw on the best ideas developed in cutting-edge research on language-integrated query, Web programming, and heterogeneous meta-programming. Skye will provide dialects, or first-class client language definitions, along with translations that map programs written in one dialect to another, or (as a special case) that perform source-to-source translation on a single dialect, for optimisation or to add functionality such as provenance-tracking. These translations will be available as libraries that can change the behaviour of already-written applications by rewriting code, so scientific database developers using Skye will be able to reuse these features instead of having to reimplement them from scratch or make wholesale changes to existing applications.
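To make the idea of a translation between dialects concrete, here is a minimal sketch in Python. It is not Skye or Links code, and every name in it (`Field`, `Lit`, `Add`, `Tagged`, `add_provenance`, `evaluate`) is hypothetical, chosen only for illustration. It assumes a toy source dialect of arithmetic over named database fields, and a target dialect that adds a single construct recording where a value came from. The translation rewrites an existing program without the program's author changing it.

```python
from dataclasses import dataclass


# Toy source dialect: arithmetic over named database fields (hypothetical).
@dataclass(frozen=True)
class Field:
    name: str


@dataclass(frozen=True)
class Lit:
    value: float


@dataclass(frozen=True)
class Add:
    left: object
    right: object


# Target dialect: the source dialect plus a term tagged with its origin.
@dataclass(frozen=True)
class Tagged:
    term: object
    origin: str


def add_provenance(term):
    """Translation: rewrite each Field so it carries its own name as origin."""
    if isinstance(term, Field):
        return Tagged(term, term.name)
    if isinstance(term, Add):
        return Add(add_provenance(term.left), add_provenance(term.right))
    return term  # literals have no database origin


def evaluate(term, row):
    """Evaluate a term of either dialect; returns (value, set of origins)."""
    if isinstance(term, Lit):
        return term.value, frozenset()
    if isinstance(term, Field):
        return row[term.name], frozenset()
    if isinstance(term, Tagged):
        value, origins = evaluate(term.term, row)
        return value, origins | {term.origin}
    if isinstance(term, Add):
        left_value, left_origins = evaluate(term.left, row)
        right_value, right_origins = evaluate(term.right, row)
        return left_value + right_value, left_origins | right_origins
    raise TypeError(f"unknown term: {term!r}")


# An already-written program, and the same program after translation.
program = Add(Field("dose"), Add(Field("offset"), Lit(1)))
row = {"dose": 10, "offset": 2}
plain_value, plain_origins = evaluate(program, row)
tracked_value, tracked_origins = evaluate(add_provenance(program), row)
```

Both runs compute the same value, but only the translated program reports which fields contributed to it. In this sketch the translation is an ordinary library function, which is the sense in which such features could be reused rather than reimplemented in each application.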
The Skye project will support a group of PhD students and postdoctoral researchers under the leadership of Dr. Cheney to pursue research on programming language design for integrating Web programming and databases, in aid of scientific data management. Topics for research could include:
  • Language design: How can we program with first-class client languages (dialects) and translations flexibly and safely?
  • Expressing and optimising client languages: How can existing client languages be embedded as dialects and translated to efficient client language code?
  • Defining modular curation techniques: How can existing (or new) curation techniques be defined using type-safe translations among dialects?
  • Case studies: What are the benefits and costs of using Skye to develop curated scientific databases?
For additional information about the project background, please consult our recent publications on this subject here and here.

PhD studentships

Two 4-year PhD studentships will be supported by the project. One studentship will cover full fees and a stipend of approximately £14,000 per year for a student of any nationality and the other will cover fees and stipend for a student with UK or EU citizenship. Additional funding may be available for exceptional candidates.

Background required

Both PhD students will carry out fundamental research relevant to the project. It is anticipated that one student will be recruited in 2016 (preferably with a project focusing on web and database programming language design and implementation) and the other in 2017 (preferably with a project focusing on data curation and applications of Skye).
Accordingly, we are currently seeking applicants with a strong background and research interests in functional programming, typed language design and implementation, metaprogramming, or Web/database programming. Candidates with demonstrated research ability in programming languages, and with at least some familiarity with databases or bioinformatics, would be especially suitable for this project. Outstanding candidates with a database-centric background and some familiarity with programming languages research will also be considered at this stage.

About the position

This PhD studentship provides a total of 4 years of funding for full-time research (with no required teaching obligations). This can be structured as a 4-year PhD (for candidates who already have research experience, e.g. a suitable Master's project) or as a 1-year Master's by Research followed by 3 years of PhD research, hosted in one of the School's two EPSRC Centres for Doctoral Training, in Data Science or Pervasive Parallelism. Prospective applicants are encouraged to discuss these alternatives with the supervisor before applying.

Application information

To apply, please follow the instructions here. Please make sure to indicate interest in this project as part of the application.

Source

Postdoctoral position on graph databases and provenance management, University of Edinburgh

Project: ADAPT: A Diagnostics Approach for Persistent Threat Detection
Supervisor: James Cheney
Deadline: February 12, 2016, 5pm GMT
We have an opening for a Research Associate in Graph Data Management on the project "ADAPT: A Diagnostics Approach to Advanced Persistent Threat Prevention", part of the "Transparent Computing" programme funded by the US Defence Advanced Research Projects Agency (DARPA).

Project description

"ADAPT: A Diagnostics Approach to Advanced Persistent Threat Prevention" is part of the "Transparent Computing" programme funded by the US Defence Advanced Research Projects Agency (DARPA). Transparent Computing (TC) is a $60 million research initiative to use provenance to improve the security of critical systems in the face of advanced persistent threats (APTs), that is, attackers who gradually infiltrate a system in order to achieve long-term (and often highly damaging) objectives.

ADAPT is one of several TC-funded research projects working together to instrument mainstream systems to collect provenance, manage and analyse the resulting massive amounts of provenance graph data, and diagnose or identify potential attacks and attackers. ADAPT is a joint project between Galois Inc., Xerox PARC, Oregon State University, and the University of Edinburgh. The University of Edinburgh team is focusing on applying provenance and database expertise to support provenance graph queries and segmentation/feature extraction, needed by normalcy detection, classification, and diagnosis techniques provided by the other ADAPT partners.

About the position

This position is based at The University of Edinburgh. You are expected to take a leading role in investigating graph database techniques applied to provenance management problems in the ADAPT project. In particular, you are expected to play an important role in the successful completion of one or more of the following project tasks:
  • Specify and implement techniques for identifying segments or features in provenance graphs
  • Analyse and improve the performance of such queries or extraction techniques on large-scale provenance graph data
  • If necessary, develop new query language features or optimizations tailored to the provenance graph queries needed by other parts of the project
  • Investigate incremental or stream-based techniques for extracting needed data from provenance graphs
  • Develop and implement abstractions for hierarchical modelling and causal linking of activities in provenance graphs, as needed by other parts of the project
The position will require system development as part of an international research project, as well as independent ideas-led research. You will be expected to work effectively with other researchers to produce prototypes, production-quality systems, high quality publications and demonstrations, and contribute to dissemination activities for the project, e.g. participating in project meetings and publishing papers in top conferences and journals. Duties will also include intermittent travel to project meetings.

Background required

The successful candidate will have expertise in databases, particularly graph data management, provenance management, query languages and optimisation, or incremental computation. The emphasis of this position is on systems-oriented research and development, so experience with database systems implementation is essential; programming languages preferred for the project include Haskell, Scala, and Python. Familiarity with virtualization technology (Docker), message queues (Apache Kafka), graph databases (Titan/Cassandra; Gremlin), or provenance querying and standards (e.g. W3C PROV) would be especially advantageous.
Applicants must, at a minimum, have a PhD degree (or be close to completion) in Computer Science, with either a track record of high quality publications or industrial experience adequate to the needs of the project. A strong background in graph databases/systems (or the ability to learn new systems quickly) is required. We expect that the project will involve practical systems development informed by conceptual or foundational research, so an ideal candidate will have strong development skills and the ability to engage with theory. Previous research experience on provenance or related topics such as machine learning/classification and information flow security would be desirable.
Please ensure that your application includes:
  • a CV listing relevant education, research experience and publications.
  • a 1-2 page statement of your research interests and how they relate to this position.
Applications that do not include these documents may not receive full consideration.

Duration and starting date

The postdoctoral position is available for 18 months starting on or as soon as possible after March 1, 2016.

The Transparent Computing programme as a whole will run from July 2015 until June 2019. This postdoctoral position is subject to extension beyond the initial 18 months, contingent on availability of funding.

Prospective applicants are encouraged to contact James Cheney (jcheney@inf.ed.ac.uk) before applying to discuss the position.

Application process and deadlines

A complete application consists of a CV and a 1-2 page research statement summarizing your background, previous research experience, and how they relate to this position.

Applications must be submitted by 5pm GMT on February 12, 2016, through the University of Edinburgh recruitment site:
https://www.vacancies.ed.ac.uk
Reference number: 035230
or directly by following this link:
direct link to the application site
Interviews will likely be held (either in person or via Skype) in mid-February.

Source