text_indexing 1.0.0
Dart library for creating an inverted index on a collection of text documents.
1.0.0 #
- Stable release
0.23.0 #
BREAKING CHANGES
Breaking changes #
This is a major re-work of the library with a significant simplification of the interfaces:
- Interface TextTokenizer removed. Use TextAnalyzer.tokenize and TextAnalyzer.tokenizeJson instead.
- Mixin InvertedIndexMixin removed.
- Instance method InvertedIndex.getFtdPostings removed, use static method InvertedIndex.ftdPostingsFromPostings instead.
- Instance method InvertedIndex.getIdFtIndex removed, use static method InvertedIndex.idFtIndexFromDictionary instead.
- Instance method InvertedIndex.getTfIndex removed, use static method InvertedIndex.tfIndexFromPostings instead.
- Instance method InvertedIndex. removed, use static method InvertedIndex. instead.
- Extension methods Iterable<DftMapEntry>.sort and Iterable<DftMapEntry>.toList removed.
- Property InvertedIndex.tokenFilter removed.
- Class TextIndexerBase removed.
- Interface method TextIndexer.indexDocumentStream added and implemented in TextIndexerMixin.
- Interface method TextIndexer.indexCollectionStream added and implemented in TextIndexerMixin.
- Factories TextIndexer, TextIndexer.stream and TextIndexer.collectionStream removed.
- Signatures changed of interface methods TextIndexer.indexText, TextIndexer.indexJson and TextIndexer.indexCollection.
- Interface method TextIndexer.dispose() added.
- Enum TermSortStrategy removed.
- Enum TokenizingStrategy removed.
- Interface TextIndexer implemented in InvertedIndex.
- Changed signature of TextIndexer.indexText.
- Changed signature of TextIndexer.indexDocumentStream.
- Changed signature of TextIndexer.indexJson.
- Changed signature of TextIndexer.indexCollectionStream.
New #
- Added InMemoryIndexBase and AsyncCallbackIndexBase to text_indexing library exports.
Bug fix #
- Fixed keyword postings in indexer.
Updated #
- Dependencies.
- Tests.
- Documentation.
- Examples.
0.23.0-5 #
0.23.0-3 #
0.23.0-2 #
0.23.0-1 #
BREAKING CHANGES
Breaking changes #
This is a major re-work of the library with a significant simplification of the interfaces:
- Interface TextTokenizer removed. Use TextAnalyzer.tokenize and TextAnalyzer.tokenizeJson instead.
- Mixin InvertedIndexMixin removed.
- Instance method InvertedIndex.getFtdPostings removed, use static method InvertedIndex.ftdPostingsFromPostings instead.
- Instance method InvertedIndex.getIdFtIndex removed, use static method InvertedIndex.idFtIndexFromDictionary instead.
- Instance method InvertedIndex.getTfIndex removed, use static method InvertedIndex.tfIndexFromPostings instead.
- Instance method InvertedIndex. removed, use static method InvertedIndex. instead.
- Extension methods Iterable<DftMapEntry>.sort and Iterable<DftMapEntry>.toList removed.
- Property InvertedIndex.tokenFilter removed.
- Class TextIndexerBase removed.
- Interface method TextIndexer.indexDocumentStream added and implemented in TextIndexerMixin.
- Interface method TextIndexer.indexCollectionStream added and implemented in TextIndexerMixin.
- Factories TextIndexer, TextIndexer.stream and TextIndexer.collectionStream removed.
- Signatures changed of interface methods TextIndexer.indexText, TextIndexer.indexJson and TextIndexer.indexCollection.
- Interface method TextIndexer.dispose() added.
- Enum TermSortStrategy removed.
- Enum TokenizingStrategy removed.
- Interface TextIndexer implemented in InvertedIndex.
Updated #
- Dependencies.
- Tests.
- Documentation.
- Examples.
0.22.4+15 #
Deprecated #
- Interface TextTokenizer is deprecated and will be removed from the next stable version of the text_analysis library. At that time text_indexer will be updated to accommodate the change and issued as version 0.23.0.
0.22.4+14 #
0.22.4+13 #
Updated #
- Bumped dependency text_analysis to version 0.23.7+12.
- Changed InvertedIndex.nGramRange to nullable.
0.22.4+12 #
Updated #
- Bumped dependency text_analysis to version 0.23.7+11.
- Changed the algorithm for extension method JSON.toSourceText.
0.22.2 #
0.22.0 #
Breaking changes #
- Added method InvertedIndex.getCollectionSize.
- Implemented InvertedIndex.getCollectionSize.
- Implemented AsyncCallbackIndex.getCollectionSize.
- Renamed function definition VocabularyLength to CollectionSizeCallback.
- Changed signature of factory InvertedIndex.inMemory.
- Changed signature of unnamed factory constructor InvertedIndex.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.21.1 #
0.21.0 #
Breaking changes #
- Added field TokenizingStrategy InvertedIndex.strategy.
- Added field InvertedIndex.keywordExtractor.
- Added method InvertedIndex.getKeywordPostings.
- Added method InvertedIndex.upsertKeywordPostings.
- Changed signature of method TextIndexer.updateIndexes.
- Changed signature of default InMemoryIndex constructor.
- Changed signature of default AsyncCallbackIndex constructor.
- Changed signature of default InvertedIndex factory constructor.
- Changed signature of default InvertedIndex.inMemory factory constructor.
New #
- Added typedef KeywordPostingsMap.
- Added typedef KeyWordPostings.
- Added function definition KeywordPostingsMapLoader.
- Added function definition KeywordPostingsMapUpdater.
- Added base class InMemoryIndexBase.
- Implemented field AsyncCallbackIndex.strategy.
- Implemented field InMemoryIndex.strategy.
- Implemented InMemoryIndex.getKeywordPostings.
- Implemented InMemoryIndex.upsertKeywordPostings.
- Implemented InMemoryIndex.keywordExtractor.
- Implemented AsyncCallbackIndex.getKeywordPostings.
- Implemented AsyncCallbackIndex.upsertKeywordPostings.
- Implemented AsyncCallbackIndex.keywordExtractor.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.19.0 #
Breaking changes #
- Changed signature of TextIndexer default unnamed factory constructor.
- Removed field TextIndexer.documentStream.
- Removed field TextIndexer.collectionStream.
- Added field InvertedIndex.nGramRange.
- Changed the signature of InvertedIndex unnamed factory.
- Changed the signature of InvertedIndex.inMemory factory.
- Changed the signature of AsyncCallbackIndex default constructor.
- Changed the signature of InMemoryIndex default constructor.
- Removed field InvertedIndex.phraseLength.
New #
- Added factory constructor TextIndexer.collectionStream.
- Added factory constructor TextIndexer.stream.
- Changed TextIndexer.indexText
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.16.0 #
Breaking changes #
- Default k-gram length changed from k =2 to k = 2 in AsyncCallbackIndex and InMemoryIndex constructors and
New #
- Unnamed factory constructor InvertedIndex returns an [AsyncCallbackIndex] instance.
- Factory constructor InvertedIndex.inMemory returns an [InMemoryIndex] instance.
Updated #
- Dependencies.
- Tests.
- Documentation.
0.15.0 #
Breaking changes #
- Renamed the following typedefs:
  - Dictionary to DftMap;
  - DictionaryEntry to DftMapEntry;
  - DictionaryLoader to DftMapLoader;
  - DictionaryUpdater to DftMapUpdater;
  - DictionaryLengthLoader to VocabularySize;
  - KGramIndex to KGramsMap;
  - KGramIndexLoader to KGramsMapLoader;
  - KGramIndexUpdater to KGramsMapUpdater;
  - Postings to PostingsMap;
  - PostingsEntry to PostingsMapEntry;
  - PostingsLoader to PostingsMapLoader;
  - PostingsUpdater to PostingsMapUpdater;
  - FieldPostingsEntry to ZonePostingsMapEntry;
  - ZonePostings to ZonePostingsMap;
  - DocumentPostingsEntry to DocPostingsMapEntry; and
  - DocumentPostings to DocPostingsMap.
- Removed HiveIndex from the test folder.
- Removed _asyncIndexerExample from the example folder.
- Renamed the text_indexing_extensions mini-library to extensions.
- Renamed the text_indexing_type_definitions mini-library to type_definitions.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.14.7 #
0.14.0 #
Breaking changes #
- Removed class TextSource.
- Removed class Sentence.
- Removed class TermPair.
- Removed TextAnalyzerConfiguration.sentenceSplitter from TextAnalyzerConfiguration interface.
- Changed TextTokenizer.tokenize return value to List<Token>.
- Changed TextTokenizer.tokenizeJson return value to List<Token>.
- Re-structured codebase.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.13.0 #
0.12.0+1 #
Updated dependencies and documentation.
0.12.0 #
BREAKING CHANGES
Breaking changes #
- Added method InvertedIndex.getKGramIndex to InvertedIndex interface.
- Added method InvertedIndex.upsertKGramIndex to InvertedIndex interface.
- Added field InvertedIndex.k to InvertedIndex interface.
- Removed field TextIndexer.postingsStream.
- Renamed method TextIndexer.emit to TextIndexer.updateIndexes.
- Added AsyncIndex.k, AsyncIndex.kGramIndexLoader and AsyncIndex.kGramIndexUpdater final fields and parameters to AsyncIndex class.
- Added InMemoryIndex.k and InMemoryIndex.kGramIndex final fields and parameters to InMemoryIndex class.
New: #
- Type alias KGramIndex.
- Type alias KGramIndexLoader.
- Type alias KGramIndexUpdater.
- Extension method void KGramIndex.addTermKGrams(Term term, Iterable<KGram> kGrams).
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.11.0 #
0.10.0 #
Breaking changes #
- TextIndexerBase default generative constructor is no longer marked const as it has a method body that initializes listeners to TextIndexer.documentStream and TextIndexer.collectionStream.
New: #
- Input stream fields TextIndexer.documentStream and TextIndexer.collectionStream added to TextIndexer interface.
- Optional named parameter Stream<Map<String, Map<String, dynamic>>>? collectionStream added to TextIndexer.async, TextIndexer.inMemory and TextIndexer.index factory constructors.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.9.0 #
Breaking changes #
- Renamed InvertedPositionalZoneIndex interface to InvertedIndex.
- Renamed TextIndexer.instance factory to TextIndexer.index.
- Parameter dictionaryLengthLoader added to AsynCallbackIndex constructor.
- Parameter dictionaryLengthLoader added to AsyncIndexer constructor.
- Parameter dictionaryLengthLoader added to TextIndexer.async factory constructor.
- Removed class InMemoryIndexer, use factory constructor TextIndexer.inMemory instead.
- Removed class AsyncIndexer, use factory constructor TextIndexer.async instead.
New: #
- Type definition FtdPostings.
- Type definition IdFtIndex.
- Type definition IdFt.
- Type definition ZoneWeightMap.
- Field getter Future<int> InvertedIndex.vocabularyLength.
- Field getter Future<int> Function() AsynCallbackIndex.dictionaryLengthLoader.
- Field getter int InvertedIndex.phraseLength.
- Field getter ZoneWeightMap InvertedIndex.zones.
- Optional named parameter ZoneWeightMap zones added to TextIndexer.async factory.
- Optional named parameter ZoneWeightMap zones added to TextIndexer.inMemory factory.
- Method Future<FtdPostings> InvertedIndex.getFtdPostings(Iterable<Term>, int).
- Method Future<IdFtIndex> InvertedIndex.getIdFtIndex(Iterable<Term>).
- Method Future<Dictionary> InvertedIndex.getTfIndex(Iterable<Term>).
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.8.0+1 #
Updated dependencies
0.8.0 #
0.7.2+1 #
Updated dependencies
0.7.2 #
Updated dependencies
0.7.1 #
Updated dependencies
0.7.0 #
0.6.0 #
BREAKING CHANGES
Breaking changes #
- Changed signature of extension method Postings.termPostingsList(Term) to Postings.termPostingsList([Iterable<Term>?]).
- Removed field InMemoryIndexer.dictionary. Use InMemoryIndexer.index.dictionary instead.
- Removed field InMemoryIndexer.postings. Use InMemoryIndexer.index.postings instead.
- Removed method TextIndexer.upsertDictionary. Use TextIndexer.index.upsertDictionary instead.
- Removed method TextIndexer.getDictionary. Use TextIndexer.index.getDictionary instead.
- Removed method TextIndexer.getPostings. Use TextIndexer.index.getPostings instead.
- Removed method TextIndexer.upsertPostings. Use TextIndexer.index.upsertPostings instead.
- Removed field InMemoryIndexer.dictionary. Use index.dictionary instead.
- Removed field InMemoryIndexer.postings. Use index.postings instead.
- Added new field InvertedIndex.analyzer, changing the signatures of factory constructors TextIndexer.inMemory and TextIndexer.async.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.6.0-2 #
0.5.0 #
BREAKING CHANGES
Deprecated:
- Field InMemoryIndexer.dictionary is deprecated. Use index.dictionary instead.
- Field InMemoryIndexer.postings is deprecated. Use index.postings instead.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.4.0 #
BREAKING CHANGES
Breaking changes #
- Renamed method TextIndexer.index to TextIndexer.indexText.
- Renamed class PersistedIndexer to AsyncIndexer.
New: #
- InvertedIndex interface and implementation.
- TextIndexer.index field getter.
- TextIndexer.index factory constructor.
- TextIndexer.async factory constructor.
- TextIndexer.inMemory factory constructor.
Deprecated:
- Method TextIndexer.upsertDictionary is deprecated. Use TextIndexer.index.upsertDictionary instead.
- Method TextIndexer.getDictionary is deprecated. Use TextIndexer.index.getDictionary instead.
- Method TextIndexer.getPostings is deprecated. Use TextIndexer.index.getPostings instead.
- Method TextIndexer.upsertPostings is deprecated. Use TextIndexer.index.upsertPostings instead.
- Field InMemoryIndexer.dictionary is deprecated. Use index.dictionary instead.
- Field InMemoryIndexer.postings is deprecated. Use index.postings instead.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.3.2 #
0.3.1 #
0.2.0 #
BREAKING CHANGES
New: #
- ZonePostings, DocumentPostings, and FieldPostingsEntry type definitions.
- Ft, Pt, TermPositions and DocId type aliases.
- Interface Document.
Breaking changes #
- Replaced object-model class PostingsEntry with typedef PostingsEntry.
- Replaced object-model class DocumentPostingsEntry with typedef DocumentPostingsEntry.
- Replaced object-model class DictionaryEntry with typedef DictionaryEntry.
Restructured and simplified the codebase.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.1.0 #
0.0.2 #
0.0.1+6 #
0.0.1 #
BREAKING CHANGES
Interfaces finalized (see breaking changes)
Breaking changes #
- TermDictionary renamed Dictionary.
- DocumentPostingsEntry renamed Postings.
- PostingsMapEntry renamed PostingsEntry.
- Term renamed DictionaryEntry.
- TermPositions renamed DocumentPostingsEntry.
- AsyncIndexer implementation.
- TextIndexerBase implementation.
- InMemoryIndexer implementation.
Updated #
- Dependencies.
- Tests.
- Examples.
- Documentation.
0.0.1-beta.3 #
0.0.1-beta.2 #
0.0.1-beta.1 #
Initial version.