
Description of mg4j index configuration #25

@MikeHopcroft

This issue tracks how we configure / intend to configure the mg4j index for maximum performance in the experiment.

  1. [Not implemented] Disable positions.
  2. Disable scoring.
  3. [Not verified] Use BitStreamHPIndexReader. Actually, we probably want to use the subclass InMemoryHPIndex. Check whether this is used by default. Right now it looks like the code uses QuasiSuccinctIndex.
  4. [Not verified] Use in-memory index. See JavaDocs for Index.UriKeys.
  5. [Not verified] Use wired index.
  6. No stemming.
  7. No stop word elimination.
  8. ??? Disable advanced queries (e.g. near, WAND, phrase).
  9. ??? Disable forward index storage for titles.
  10. ??? Disable forward index storage for BM25F scoring information.
  11. Exporter for Partitioned Elias-Fano index generates a frequency of 1 for every posting.
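
For item 4, a minimal sketch of how the in-memory option might be requested, assuming the index is loaded through a URI whose query part carries the lowercase name of an Index.UriKeys constant (`inmemory`, `mapped`). The class name, enum, helper method, and the basename `/data/index-text` are hypothetical, and the key spellings should be checked against the JavaDocs before relying on them. The helper only builds the URI string, so it runs without MG4J on the classpath.

```java
public class IndexUriSketch {
    /** How the index files should be accessed. These names are illustrative only. */
    public enum LoadMode { DISK, MAPPED, IN_MEMORY }

    /**
     * Appends the query part that (by assumption) selects the load mode.
     * The keys mirror Index.UriKeys.INMEMORY and Index.UriKeys.MAPPED.
     */
    public static String uriFor(String basename, LoadMode mode) {
        switch (mode) {
            case IN_MEMORY:
                return basename + "?inmemory=1";
            case MAPPED:
                return basename + "?mapped=1";
            default:
                // No query part: leave the choice to the library's default.
                return basename;
        }
    }

    public static void main(String[] args) {
        // Hypothetical basename; substitute the real index location.
        String uri = uriFor("/data/index-text", LoadMode.IN_MEMORY);
        System.out.println(uri);
        // With MG4J available, this string would be handed to
        // Index.getInstance(uri); that call is omitted here so the
        // sketch stays self-contained.
    }
}
```

Whether the resulting index is actually an InMemoryHPIndex or a QuasiSuccinctIndex depends on how the index was built, so item 3 still needs to be verified separately.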
