https://github.com/eliotjones/biocif
Parse the CIF file format for Protein Data Bank (PDB) data.
- Host: GitHub
- URL: https://github.com/eliotjones/biocif
- Owner: EliotJones
- License: mit
- Created: 2020-02-16T16:53:13.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2020-03-28T13:19:40.000Z (about 5 years ago)
- Last Synced: 2024-03-29T05:23:51.230Z (about 1 year ago)
- Topics: cif, cif-formats, crystallography, csharp, pdbx, protein-data-bank
- Language: C#
- Size: 1.63 MB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Bio CIF #
BioCif is a small C# library designed to parse the [Crystallographic Information File](https://www.iucr.org/resources/cif) (CIF) format, the standard for information interchange in crystallography. It is designed to be fast and easy-to-use.
It provides both tokenization and parsing of the CIF format for versions 1.1 and 2.0, as well as convenience wrappers for [Protein Data Bank](https://www.rcsb.org/) (PDB) data. The PDB hosts protein structure data in a CIF-based format (PDBx/mmCIF, macromolecular CIF).
## Usage ##
To access the raw stream of tokens:

    using System;
    using System.IO;
    using BioCif.Core.Tokenization;
    using BioCif.Core.Tokens;

    using (var fileStream = File.OpenRead(@"C:\path\to\data.cif"))
    using (var streamReader = new StreamReader(fileStream))
    {
        // Print the type of every token in the file.
        foreach (Token token in CifTokenizer.Tokenize(streamReader))
        {
            Console.WriteLine(token.TokenType);
        }
    }

To access the parsed CIF structure:

    using (var fileStream = File.OpenRead(@"C:\path\to\data.cif"))
    {
        Cif cif = CifParser.Parse(fileStream);

        // A CIF file can hold several data blocks; take the first.
        DataBlock block = cif.DataBlocks[0];
        Console.WriteLine($"Block name: {block.Name}");

        foreach (IDataBlockMember member in block.Members)
        {
            // ...
        }
    }

To access a parsed PDBx/mmCIF:

    Pdbx pdbx = PdbxParser.ParseFile(@"C:\path\to\mypdbx.cif");
    PdbxDataBlock block = pdbx.First;
    var auditAuthors = block.AuditAuthors;

## Notes ##
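As a rough illustration of the terms and structures described in these notes, the sketch below classifies the lines of a very simple CIF text into data block headers, save frame boundaries, and data items (data name plus data value). It is not how BioCif works internally: `LineKind`, `CifLine`, and `CifSketch.Classify` are hypothetical names invented for this example, and the code assumes one item per line with no loops, quoted strings, multi-line values, or trailing comments.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; none of this is BioCif API.
public enum LineKind { DataBlock, SaveFrameStart, SaveFrameEnd, DataItem }

public sealed class CifLine
{
    public CifLine(LineKind kind, string name, string value)
    {
        Kind = kind;
        Name = name;
        Value = value;
    }

    public LineKind Kind { get; }
    public string Name { get; }   // block name, frame name, or data name (tag)
    public string Value { get; }  // data value; null unless Kind is DataItem
}

public static class CifSketch
{
    // Classifies each line of a very simple CIF text. Loops, quoted or
    // multi-line values, and trailing comments are deliberately not handled.
    public static List<CifLine> Classify(string cifText)
    {
        var result = new List<CifLine>();
        foreach (var raw in cifText.Split('\n'))
        {
            var line = raw.Trim();
            if (line.Length == 0 || line[0] == '#') continue; // blank or comment line

            // Split the line into its first word and whatever follows it.
            int ws = line.IndexOfAny(new[] { ' ', '\t' });
            string head = ws < 0 ? line : line.Substring(0, ws);
            string rest = ws < 0 ? "" : line.Substring(ws).Trim();

            if (head.StartsWith("data_", StringComparison.OrdinalIgnoreCase))
            {
                result.Add(new CifLine(LineKind.DataBlock, head.Substring(5), null));
            }
            else if (head.StartsWith("save_", StringComparison.OrdinalIgnoreCase))
            {
                // A bare "save_" terminates the current save frame.
                string frame = head.Substring(5);
                result.Add(frame.Length == 0
                    ? new CifLine(LineKind.SaveFrameEnd, null, null)
                    : new CifLine(LineKind.SaveFrameStart, frame, null));
            }
            else if (head[0] == '_' && rest.Length > 0)
            {
                // Data item: data name followed by its data value.
                result.Add(new CifLine(LineKind.DataItem, head, rest));
            }
        }
        return result;
    }
}
```

Calling `CifSketch.Classify("data_demo\n_cell.length_a 10.5\n")` yields a `DataBlock` entry named `demo` followed by a `DataItem` entry pairing `_cell.length_a` with `10.5`. For real files, prefer the BioCif APIs shown under Usage.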
Defined terms from the CIF specification:
+ data file: information relating to an experiment.
+ dictionary file: contains information about data names.
+ data name (AKA tag): identifies the content of a data value.
+ data value: string representing a value of any type.
+ data item: data name + data value.

Notes on structures within a CIF file:

data block: highest level of a CIF file

    data_<name>
    [data items or save frames]

save frame: partitioned collection of data items, only used in dictionary files

    save_<name>
    [data items]
    save_ # Terminates the save frame

## Useful Links ##
+ Dictionary for PDBx/mmCIF data names: http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Index/
+ CIF Version 1.1 specification: https://www.iucr.org/resources/cif/spec/version1.1/cifsyntax
+ Search PDBx structures in the PDB: https://www.rcsb.org/#Category-search
+ Existing C# tools for CIF format among others: https://github.com/mindleaving/genome-tools
+ CIF on Wikipedia: https://en.wikipedia.org/wiki/Crystallographic_Information_File
+ Crystallography Open Database of non-mmCIF CIF files: http://www.crystallography.net/cod/index.php

## Status ##
Early stage/incomplete/unmaintained.