Example of Inverted-file Index Construction
Let us consider a small imaginary protein A with 11 amino acid (AA) residues and 2 secondary structure elements (SSEs).
(All measurements are in angstroms.)
Figure S1: The C-alpha coordinates and SSE annotations in protein A.
We can represent protein A as a distance matrix of its pairwise C-alpha distances.
Figure S2: Distance matrix of Protein A. Shaded areas are the inter-SSE contact regions.
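The distance-matrix step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the coordinates are made up (they are not the values from Figure S1), and the helper names distance_matrix and contact_region, as well as the convention of describing an SSE by an inclusive (first, last) residue index pair, are assumptions introduced here.

```python
import math

def distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(coords)
    dm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dm[i][j] = dm[j][i] = math.dist(coords[i], coords[j])
    return dm

def contact_region(dm, sse_a, sse_b):
    """Submatrix with rows from SSE a and columns from SSE b.

    Each SSE is an assumed (first, last) pair of inclusive residue indices.
    """
    (a0, a1), (b0, b1) = sse_a, sse_b
    return [row[b0:b1 + 1] for row in dm[a0:a1 + 1]]

# Hypothetical C-alpha coordinates in angstroms (not taken from Figure S1).
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0), (0.0, 0.0, 12.0)]
dm = distance_matrix(coords)

# Treat residues 0-1 as one SSE and residues 2-3 as another.
region = contact_region(dm, (0, 1), (2, 3))
```

Here region is the 2 x 2 block of the matrix that pairs every residue of the first SSE with every residue of the second, which corresponds to one shaded area in the figure.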
We can represent the two SSEs as vectors in 3D space.
Figure S3: Vector representation of two SSEs in protein A.
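One simple way to obtain such a vector is to connect the first and last C-alpha atoms of each SSE. The sketch below assumes that convention; the text does not state how the vectors are actually derived (a least-squares axis fit would be another option), and the coordinates and the name sse_vector are hypothetical.

```python
import math

def sse_vector(coords, first, last):
    """Vector from the first to the last C-alpha of an SSE (inclusive indices).

    Assumption: the SSE is summarized by its end-to-end vector.
    """
    return tuple(e - s for s, e in zip(coords[first], coords[last]))

def length(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(c * c for c in v))

# Hypothetical C-alpha coordinates in angstroms (not taken from Figure S1).
coords = [
    (0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0),   # SSE 1: residues 0-2
    (3.0, 2.0, 0.0), (3.0, 2.0, 3.8), (3.0, 2.0, 7.6),   # SSE 2: residues 3-5
]
v1 = sse_vector(coords, 0, 2)
v2 = sse_vector(coords, 3, 5)
```

With these made-up coordinates, v1 points along the x axis and v2 along the z axis, so the two SSEs are perpendicular.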
We can extract a feature vector from each contact region K_ab formed by SSE a and SSE b.
Figure S4: 3 feature vectors for 3 contact regions in protein A.
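To make the idea of a per-region feature vector concrete, here is a small sketch. The three features it computes (the angle between the two SSE vectors, the minimum distance in the contact region, and the number of residue pairs within a cutoff) are illustrative assumptions only; the text does not list the actual features, and the 8.0 angstrom cutoff and the function names are likewise hypothetical.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.degrees(math.acos(cos))

def feature_vector(region, vec_a, vec_b, cutoff=8.0):
    """Summarize one contact region K_ab as a tuple of illustrative features."""
    flat = [d for row in region for d in row]
    return (
        round(angle_deg(vec_a, vec_b), 1),     # inter-SSE angle (degrees)
        round(min(flat), 1),                   # closest residue pair (angstroms)
        sum(1 for d in flat if d <= cutoff),   # residue pairs within the cutoff
    )

# Hypothetical contact region (distances in angstroms) and SSE vectors.
region = [[5.0, 9.0], [7.5, 12.0]]
fv = feature_vector(region, (1.0, 0.0, 0.0), (0.0, 2.0, 0.0))
```

Whatever features are chosen, the point is that each region K_ab is reduced to a short, fixed-length tuple that can later be used as a lookup key.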
We hash each feature vector into a hash table; each entry points to the posting list of the proteins in which that feature vector occurs.
Figure S5: Sample portion of an inverted-file index.
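The hashing step can be sketched as follows. This is a minimal illustration under stated assumptions: feature vectors are quantized into fixed-width bins so that similar vectors share a key, and a posting list is a plain list of protein identifiers. The bin widths, the class name InvertedIndex, and the example feature values are all hypothetical and are not taken from Figure S5.

```python
from collections import defaultdict

def quantize(features, bin_widths):
    """Map a real-valued feature vector to a tuple of integer bin indices."""
    return tuple(int(f // w) for f, w in zip(features, bin_widths))

class InvertedIndex:
    """Hash table from quantized feature vectors to posting lists."""

    def __init__(self, bin_widths):
        self.bin_widths = bin_widths
        self.table = defaultdict(list)

    def add(self, protein_id, features):
        """Record that protein_id contains a contact region with these features."""
        postings = self.table[quantize(features, self.bin_widths)]
        if protein_id not in postings:
            postings.append(protein_id)

    def lookup(self, features):
        """Return the proteins whose index key matches the query features."""
        key = quantize(features, self.bin_widths)
        return list(self.table.get(key, []))

# Hypothetical feature vectors for two proteins.
index = InvertedIndex(bin_widths=(10.0, 1.0, 1.0))
index.add("A", (92.0, 5.2, 2.0))
index.add("A", (30.0, 7.0, 4.0))
index.add("B", (95.5, 5.9, 2.0))

hits = index.lookup((91.0, 5.5, 2.0))
```

In this toy example the query falls into the same bins as one feature vector from protein A and one from protein B, so both identifiers appear in the returned posting list. A query then only needs to inspect the posting lists of its own feature vectors rather than scan every protein.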