MyGene.info provides simple-to-use REST web services to query/retrieve gene annotation data. It's designed with simplicity and performance emphasized. A typical use case is to use it to power a web application which requires querying genes and obtaining common gene annotations. For example, MyGene.info services are used to power BioGPS.
NEW: Embed a gene query input field with autocomplete!
NEW: Python client released at PyPI! Get it by "pip install mygene".
NEW: Check out new "API" page!
To cite MyGene.info:
Wu C, MacLeod I, Su AI (2013) BioGPS and MyGene.info: organizing online, gene-centric information. Nucl. Acids Res. 41(D1): D561-D565.
MyGene.info provides two simple web services: one for gene queries and the other for gene annotation retrieval. Both return results in JSON format.
http://mygene.info/query
http://mygene.info/query?q=cdk2
http://mygene.info/query?q=cdk2+AND+species:human
http://mygene.info/query?q=cdk?
http://mygene.info/query?q=p*
http://mygene.info/query?q=entrezgene:1017
http://mygene.info/query?q=ensemblgene:ENSG00000123374
You can read the full description of our query syntax here.
NEW: We now support batch queries via POST. See how to make POST call.
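The query URLs listed above can also be assembled programmatically. The following is a minimal Python 3 sketch, assuming only the `http://mygene.info/query` endpoint and the `q` parameter shown above; the helper name `build_query_url` and its `species` argument are hypothetical conveniences, not part of any MyGene.info client.

```python
from urllib.parse import urlencode

QUERY_ENDPOINT = "http://mygene.info/query"  # endpoint shown in the examples above


def build_query_url(term, species=None):
    """Return a query URL for `term`, optionally restricted to one species.

    `build_query_url` is an illustrative helper, not an official API.
    """
    q = term if species is None else f"{term} AND species:{species}"
    # Spaces become '+'; ':' '*' and '?' are left as-is so the result
    # matches the hand-written example URLs (e.g. q=cdk2+AND+species:human).
    return QUERY_ENDPOINT + "?" + urlencode({"q": q}, safe=":*?")
```

For instance, `build_query_url("cdk2", species="human")` reproduces the third example URL above, and `build_query_url("p*")` reproduces the wildcard one.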
http://mygene.info/gene/<geneid>
http://mygene.info/gene/1017
http://mygene.info/gene/ENSG00000123374
http://mygene.info/gene/1017?filter=name,symbol,summary
"<geneid>" can be any Entrez or Ensembl Gene id from a supported species. A retired Entrez Gene id also works, provided it has been replaced by a new one.
You can read the full description of our query syntax here.
NEW: We now support batch queries via POST. See "how to make POST call".
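As a small illustration of the gene annotation URLs above, the Python 3 sketch below builds `http://mygene.info/gene/<geneid>` URLs with an optional `filter` list. Only the URL pattern and the `filter` parameter come from the examples above; the helper name `build_gene_url` is hypothetical.

```python
from urllib.parse import quote

GENE_ENDPOINT = "http://mygene.info/gene"  # endpoint shown in the examples above


def build_gene_url(gene_id, fields=None):
    """Return the annotation URL for one gene id (Entrez or Ensembl).

    `fields`, if given, is a list of field names sent as a
    comma-separated `filter` parameter, as in the example above.
    """
    url = f"{GENE_ENDPOINT}/{quote(str(gene_id), safe='')}"
    if fields:
        url += "?filter=" + ",".join(fields)
    return url
```

Calling `build_gene_url(1017, ["name", "symbol", "summary"])` yields the last example URL listed above.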
You can call MyGene.info services from either the server side or the client side (via AJAX). Sample code can be found in the "demo" section.
All common programming languages provide functions for making HTTP requests and parsing JSON. For Python, you can use the built-in httplib and json modules (v2.6 and up), or the third-party httplib2 and simplejson modules. For Perl, the LWP::Simple and JSON modules should work nicely.
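A server-side call can be very short. The sketch below is written for Python 3, where `urllib.request` takes the place of the `httplib` module named above; it simply issues a GET request and decodes the JSON body. The function name `fetch_json` and its `opener` argument (there only so the function can be exercised without network access) are assumptions made for this example.

```python
import json
from urllib.request import urlopen


def fetch_json(url, opener=urlopen, timeout=10):
    """GET `url` and decode the JSON response body into Python objects.

    `opener` defaults to urllib's urlopen; it is a hypothetical hook that
    lets callers substitute another callable with the same signature.
    """
    with opener(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `fetch_json("http://mygene.info/gene/1017?filter=name,symbol,summary")` would then return the parsed annotation object, assuming the service is reachable.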
When making an AJAX call from a web application, the call is restricted by the browser's "same-origin" security policy.
To overcome the "same-origin" restriction, you can create a proxy on your server side that forwards to our services, and then call your proxied services from your web application.
Setting up a proxy in popular server-side applications, like Apache, Nginx and PHP, is straightforward.
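To make the proxy idea concrete, here is a rough sketch using only the Python 3 standard library rather than Apache, Nginx or PHP. It forwards GET requests under a local `/mygene` prefix to `http://mygene.info`, so that pages served from your own host can call it without crossing origins. The prefix, the port and all names are assumptions for illustration; a production setup would more likely use the web server's own proxy facilities.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

UPSTREAM = "http://mygene.info"  # service being proxied
PREFIX = "/mygene"               # hypothetical local path prefix


def upstream_url(local_path):
    """Map '/mygene/query?q=cdk2' to 'http://mygene.info/query?q=cdk2'.

    Returns None for paths outside the proxied prefix.
    """
    if not local_path.startswith(PREFIX + "/"):
        return None
    return UPSTREAM + local_path[len(PREFIX):]


class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        target = upstream_url(self.path)
        if target is None:
            self.send_error(404, "Not a proxied path")
            return
        try:
            with urlopen(target, timeout=10) as upstream:
                body = upstream.read()
                content_type = upstream.headers.get("Content-Type", "application/json")
        except OSError:
            # Covers URLError/HTTPError and socket timeouts.
            self.send_error(502, "Upstream request failed")
            return
        # Relay the upstream body to the same-origin caller.
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the proxy until interrupted (blocking call)."""
    HTTPServer(("localhost", port), ProxyHandler).serve_forever()
```

After calling `serve()`, a page served from the same host could request `/mygene/query?q=cdk2` instead of contacting mygene.info directly.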
Because our core services are called as simple GET HTTP requests (though we support POST requests for batch queries too), you can also bypass the "same-origin" restriction by making a JSONP call. To read more about JSONP, see 1, 2, or just Google it. All our services accept an optional "jsoncallback" parameter, so that you can pass the name of your callback function to make a JSONP call.
All popular JavaScript libraries support making JSONP calls, for example jQuery, ExtJS and MooTools.
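In a browser the JavaScript library handles JSONP for you, but the mechanics are easy to see in a few lines of Python. The sketch below appends the `jsoncallback` parameter to a service URL and strips the callback padding from a response. It assumes the conventional JSONP shape `callbackName(<json>)`; the exact text the server emits is not specified here, and both helper names are hypothetical.

```python
import json
import re
from urllib.parse import urlencode


def jsonp_url(base_url, callback):
    """Append the 'jsoncallback' parameter to a service URL."""
    sep = "&" if "?" in base_url else "?"
    return base_url + sep + urlencode({"jsoncallback": callback})


def unwrap_jsonp(text, callback):
    """Strip the 'callback( ... )' padding and parse the JSON inside.

    Assumes the conventional JSONP form: callback(<json>) with an
    optional trailing semicolon.
    """
    pattern = rf"\s*{re.escape(callback)}\s*\((.*)\)\s*;?\s*"
    match = re.fullmatch(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError("response is not wrapped in the expected callback")
    return json.loads(match.group(1))
```

In the browser, the padded response is executed as a script, which is why the callback function must already be defined on the page.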
(NEW) Cross-Origin Resource Sharing (CORS) is a W3C draft specification defining client-side cross-origin requests. It is supported by all major browsers by now (Internet Explorer 8+, Firefox 3.5+, Safari 4+, and Chrome), but not many people are aware of it. Unlike JSONP, which is limited to GET requests, CORS lets you make cross-domain POST requests as well. Our services support CORS on both GET and POST requests. You can find more information and use cases here and here.
jQuery's native ajax call has supported CORS since v1.5.
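Under CORS, the browser sends an `Origin` header with the request and only hands the response to the page if the server replies with a matching `Access-Control-Allow-Origin` header. The Python sketch below is a simplified model of that browser-side check, ignoring preflight requests and credentials; it is not code you need to write when using jQuery, and the header values actually returned by the services are not assumed here.

```python
def cors_allows(response_headers, origin):
    """Return True if the response headers let `origin` read the response.

    Simplified model: only Access-Control-Allow-Origin is considered,
    and it must be either '*' or exactly the requesting origin.
    """
    # Header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    return allowed is not None and allowed.strip() in ("*", origin)
```

For example, a response carrying `Access-Control-Allow-Origin: *` passes the check for any origin, while a response without the header fails it.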
In this demo, we want to create a web site to display expression charts from a microarray dataset (Affymetrix MOE430v2 chip). The expression data are indexed by probeset ids, but we need to allow users to query for any mouse gene using any commonly used identifier, and then display expression charts for the selected gene.
We implemented this demo in three ways:
It's a simple Python CGI script. To run it, just drop it into your favorite web server's cgi-bin folder (make sure Python, v2.6 and up, is in the path).
This single Python script can be used to run a standalone website. Just run: python mygene_info_demo_tornado.py. Your website is then up at http://localhost:8000.
Besides Python (v2.6 and up), you also need tornado to run this code. You can either install it on your own, or download this zip file, which includes tornado.
The zip file contains one HTML file and one JavaScript file. There is no server-side code at all. To run it, just unzip it and open the HTML file in any browser. All remote service calls are made on the client side (by the browser). Putting the files on any web server that serves static files will allow you to publish the demo to the world.
(NEW)
The zip file contains one HTML file and one JavaScript file. There is no server-side code at all. To run it, just unzip it and open the HTML file in any browser. All remote service calls are made on the client side (by the browser). Putting the files on any web server that serves static files will allow you to publish the demo to the world.
This demo is almost the same as the one using JSONP, except that the actual AJAX call to MyGene.info server is made via CORS.
MyGene.info is built on CouchDB, a document-based database. Unlike more commonly used relational database systems (e.g., Oracle, MySQL), data are stored as "key-document" pairs. The "document" is a JSON-formatted gene annotation object, while the "key" is a gene ID (Entrez or Ensembl). The hierarchical structure of gene annotation data can be represented naturally in this key-document model. This simple object structure in CouchDB greatly simplifies both data loading and data queries, and also delivers impressive query performance.
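The key-document idea can be pictured with a plain Python dictionary. The sketch below is a toy in-memory stand-in, not CouchDB's API and not the actual MyGene.info schema: the `symbol` and `name` fields appear in the filter example earlier, while `taxid` and the nested `ensembl` sub-document are a hypothetical layout chosen to show how hierarchical annotation fits in one document.

```python
# Toy key-document store: gene id (key) -> annotation object (document).
# Field layout is illustrative only, not the real MyGene.info schema.
GENE_DOCS = {
    "1017": {
        "symbol": "CDK2",
        "name": "cyclin-dependent kinase 2",
        "taxid": 9606,                              # human
        "ensembl": {"gene": "ENSG00000123374"},     # nested sub-document
    },
}


def get_gene(gene_id, fields=None):
    """Fetch a document by key, optionally keeping only selected fields."""
    doc = GENE_DOCS[str(gene_id)]  # a single key lookup, no joins
    if fields is None:
        return doc
    return {name: doc[name] for name in fields if name in doc}
```

Retrieving a gene is a single lookup by key, and nested values such as `get_gene(1017)["ensembl"]["gene"]` come back with the document instead of requiring a join across tables.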
On top of CouchDB, we use tornado, a lightweight and fast web framework in Python, to build our application layer. Nginx is then used as the front end to serve outside requests.
The source code of MyGene.info is hosted at Bitbucket.
We currently support all genes from nine species:
| Common name | Scientific name | Taxonomy id | Genome assembly |
|---|---|---|---|
| human | Homo sapiens | 9606 | hg19 |
| mouse | Mus musculus | 10090 | mm9 |
| rat | Rattus norvegicus | 10116 | rn4 |
| fruitfly | Drosophila melanogaster | 7227 | dm3 |
| nematode | Caenorhabditis elegans | 6239 | ce7 |
| zebrafish | Danio rerio | 7955 | danRer6 |
| thale-cress | Arabidopsis thaliana | 3702 | NA |
| frog | Xenopus tropicalis | 8364 | xenTro2 |
| pig | Sus scrofa | 9823 | susScr2 |
Gene annotation data are updated once per month. More up-to-date information about the data can be accessed here.
Your feedback to help@mygene.info is welcome.