Pathology of Genetically Engineered and Other Mutant Mice
FAIR Data Access and Databases of Databases
The move to make scientific data openly discoverable, accessible, and usable has led to the formalization of open-data principles known as FAIR [2]: data should be findable, accessible, interoperable, and reusable. Databases increasingly try to comply with the FAIR semantic and technical access requirements to support data discovery and interoperability, and many now provide computational access through an Application Programming Interface (API) as well as through Hypertext Markup Language (HTML) pages, together with complete data dumps and data reports. Searching with standardized semantics and ontologies is a powerful way to query databases, and for databases such as MouseMine it is possible to run SPARQL Protocol and RDF (Resource Description Framework) Query Language (SPARQL) queries.
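As a rough illustration of computational access, the sketch below builds a SPARQL request URL in the style of the SPARQL 1.1 Protocol, where the query text travels in a `query` parameter. The endpoint address, the search term, and the helper name `build_sparql_request` are assumptions made for this example; they are not documented details of MouseMine or of any other resource named in this chapter.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint: a placeholder, not the address of any real database.
ENDPOINT = "https://example.org/sparql"

# Illustrative query: find entities whose label contains a search term.
# rdfs:label is a standard RDFS property; the search term is arbitrary.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?entity ?label
WHERE {
  ?entity rdfs:label ?label .
  FILTER(CONTAINS(LCASE(STR(?label)), "alopecia"))
}
LIMIT 10
"""


def build_sparql_request(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the 'query' parameter."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = build_sparql_request(ENDPOINT, QUERY)

# Decode the URL again to confirm the query survives percent-encoding intact.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```

Nothing is sent over the network here. To run such a query against a real service, the URL would be requested with an `Accept` header such as `application/sparql-results+json`, using whatever endpoint the database itself documents.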
Standardization of database structures and access goes hand in hand with the formalization of standards, which is essential for interoperability and data aggregation. Different communities have established specific standards for metadata, minimal information (MI) for reporting (MI standards), and other data structures. These standards, and the characteristics of the databases that use them, can now be found in databases of databases such as FAIRsharing. FAIRsharing lists and describes not only the standards used in each database but also the type of data it holds and, importantly, the species covered. Currently, FAIRsharing lists 90 databases that are either dedicated to the mouse or contain specifically mouse data [2].
The Re3data repository [3] lists 57 mouse data resources. Interestingly, these overlap only to a limited extent with those listed in FAIRsharing, which shows the value of consulting as many metaresources as possible. Of course, many databases, particularly small but often valuable ones, are not listed in these resource repositories. One aim of this chapter is to introduce some of these smaller but nevertheless very valuable databases.
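The overlap between two metaresource listings can be checked with simple set arithmetic. The sketch below is only a minimal illustration: the function name `overlap_report` and the resource names are hypothetical placeholders, not actual entries from FAIRsharing or Re3data, and matching on normalized names is an assumption that would miss resources listed under different names.

```python
def overlap_report(list_a, list_b):
    """Compare two lists of resource names, ignoring case and outer spaces."""
    a = {name.strip().lower() for name in list_a}
    b = {name.strip().lower() for name in list_b}
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Jaccard index: shared entries as a fraction of all distinct entries.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder names only; these are not real registry contents.
registry_a = ["Resource A", "Resource B", "Resource C"]
registry_b = ["resource b ", "Resource D"]

report = overlap_report(registry_a, registry_b)
print(report["shared"], report["jaccard"])
```

A low Jaccard index indicates that each listing contributes resources the other lacks, which is the situation described above.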
Table 2.1 Websites with information useful for phenotyping mutant laboratory mice.
Website name | Address | Data resource |
---|---|---|
Alliance of Genome Resources | https://www.alliancegenome.org | Identification of animal models for human disease |
Atlas of mouse cardiovascular development | https://www.devbio.pitt.edu/research/atlas-mouse-cardiovascular-development | Atlas of the mouse cardiovascular system using EFIC |
Bonebase | http://www.bonebase.org | Variation in bone mass using μCT and histomorphometry for KOMP mice |
Emouseatlas | https://www.emouseatlas.org/emap/home.html | Mouse embryonic anatomy and development |
GUDMAP | https://www.gudmap.org | Urogenital and prostate development with a collection of immunohistochemical, histological, and immunofluorescence images, and gene expression data |
International Mouse Phenotyping Consortium (IMPC) | https://www.mousephenotype.org | Summaries of phenotyping of genetically engineered mice in the KOMP program |
Monarch | https://monarchinitiative.org | Identification of animal models for human disease |
Mouse Brain Atlas | https://developingmouse.brain-map.org, https://mouse.brain-map.org | High‐resolution physical atlases of the adult and embryonic mouse brains |
Mouse Genome Informatics | http://www.informatics.jax.org | Large datasets on mice, genetics, inbred strains, mutant lines, etc. |
Mouse Genome Informatics SNP Database | http://www.informatics.jax.org/snp | Single nucleotide polymorphisms for inbred strains |
MouseMine | http://www.mousemine.org/mousemine/begin.do | Comparative human and mouse data |
Mouse Models of Human Cancer Database (MMHCD) | http://tumor.informatics.jax.org/mtbwi/index.do | Mouse models of human cancer data and images |
Mouse Phenome Database | https://phenome.jax.org | Variety of data on studies that compare inbred strains |
MUTAGENETIX | https://mutagenetix.utsouthwestern.edu | Immunological phenotypes of single gene mouse mutants produced by ENU mutagenesis |
National Toxicology Program Nonneoplastic Lesion Atlas | https://ntp.niehs.nih.gov/nnl/index.htm | Atlas of images and descriptions of nonneoplastic lesions in rats and mice |
Noah's Arkive | http://noahsarkive.cldavis.org/cgi-bin/show_image_info_page.cgi | Images of lesions from many different species |
Online Mendelian Inheritance in Man (OMIM) | https://www.ncbi.nlm.nih.gov/omim | Summaries, with references, of genetically based diseases in humans, with reference to animal models |
Pathbase, The European Mouse Pathology Database | http://www.pathbase.net | Images of lesions found in mice (spontaneous, genetically engineered, etc.) |
PATHBIO | http://www.pathbio.org | Training modules, X‐ray atlas, lymphatic mapping atlas, images of inbred strain specific diseases |
Sanger Mouse SNP Database | https://www.sanger.ac.uk/sanger/Mouse_SnpViewer/rel-1505 | Single nucleotide polymorphisms for 36 inbred strains |
Skinbase | http://eulep.pdn.cam.ac.uk/~skinbase/index.php | Subdirectory of Pathbase focusing on normal and abnormal skin phenotypes in mice |
Many archival databases and some reference resources depend heavily on users to curate material, correct uploads, and add value for their community. Databases run for specific communities can provide a high degree of added value at low cost and contribute heavily to scientific sustainability. Image databases especially fall into this category; in particular, they need new material from experts in the field to expand the resource continually and to curate the material accurately. Pathologists can use these resources as legacy tools for the extra photomicrographs they take while working up new models or defining spontaneous diseases in colonies. Such additional images can be referenced in the primary publication as publicly available on the websites, which is usually very well received by journals and by reviewers of future grant applications. Many of the current public repositories will not accept files as large as zoomable histopathology images, and they limit user uploads. This, together with the cost of commercial repositories, is a specific problem for very image-intensive histopathology and cell biology data.