libept
Maintains and accesses a Xapian index of package descriptions.
#include <textsearch.h>
Classes | |
struct | ExtraIndexer |
Public Member Functions | |
TextSearch () | |
Xapian::Database & | db () |
Access the Xapian database. | |
const Xapian::Database & | db () const |
Access the Xapian database. | |
time_t | timestamp () const |
Timestamp of when the Xapian database was last updated. | |
bool | hasData () const |
Returns true if the index has data. | |
bool | needsRebuild (apt::Apt &apt) |
Returns true if the index is older than the Apt database information. | |
bool | rebuildIfNeeded (apt::Apt &apt, const std::vector< const ExtraIndexer * > &extraIndexers=std::vector< const ExtraIndexer * >()) |
Rebuild the index if needed. | |
Xapian::docid | docidByName (const std::string &pkgname) const |
Retrieve a Xapian docid by package name. | |
Xapian::Query | makeORQuery (const std::string &keywords) const |
Tokenize the string and build an OR query with the resulting keywords. | |
Xapian::Query | makePartialORQuery (const std::string &keywords) const |
Tokenize the string and build an OR query with the resulting keywords, treating the last token as only partially typed. | |
template<typename ITER > | |
Xapian::Query | makeORQuery (const ITER &begin, const ITER &end) const |
Build a query with the given keywords, specified as iterators of strings. | |
std::vector< std::string > | expand (Xapian::Enquire &enq) const |
Return a list of tag-based terms that can be used to expand an OR query. | |
Xapian::Query | makeRelatedQuery (const std::string &pkgname) const |
Create a query to look for packages similar to the given one. | |
double | getDoubleValue (const std::string &pkgname, Xapian::valueno val_id) const |
Get the double value stored in value slot val_id for the given package. | |
int | getIntValue (const std::string &pkgname, Xapian::valueno val_id) const |
Get the integer value stored in value slot val_id for the given package. | |
Protected Member Functions | |
void | normalize_and_add (Xapian::Document &doc, const std::string &term, int &pos) const |
Add normalised tokens computed from the string to the document doc. | |
Static Protected Member Functions | |
static std::string | toLower (const std::string &str) |
Return a lowercased copy of the string. | |
Protected Attributes | |
time_t | m_timestamp |
Xapian::Database | m_db |
Xapian::Stem | m_stem |
Maintains and accesses a Xapian index of package descriptions.
Unlike Debtags and Popcon, TextSearch does not attempt to create the index in the home directory when no system index is found and it is not running as root. This avoids silently building large indexes (>50Mb) in users' home directories.
The idea then is to have root keep the index up to date, possibly running a reindexing tool once a day, or after an apt-get update.
This works because the full text search index is useful even if it is slightly out of date.
References ept::textsearch::Path::index(), ept::textsearch::Path::indexTimestamp(), m_db, and m_timestamp.
Xapian::Database& ept::textsearch::TextSearch::db | ( | ) | [inline] |
Access the Xapian database.
References m_db.
Referenced by getDoubleValue(), getIntValue(), and makeRelatedQuery().
const Xapian::Database& ept::textsearch::TextSearch::db | ( | ) | const [inline] |
Access the Xapian database.
References m_db.
Xapian::docid ept::textsearch::TextSearch::docidByName | ( | const std::string & | pkgname | ) | const |
Retrieve a Xapian docid by package name.
References m_db.
Referenced by getDoubleValue(), and getIntValue().
vector< string > ept::textsearch::TextSearch::expand | ( | Xapian::Enquire & | enq | ) | const |
Return a list of tag-based terms that can be used to expand an OR query.
double ept::textsearch::TextSearch::getDoubleValue | ( | const std::string & | pkgname, |
Xapian::valueno | val_id | ||
) | const |
Get the double value stored in value slot val_id for the given package.
References db(), and docidByName().
Referenced by TestTextsearch::values().
int ept::textsearch::TextSearch::getIntValue | ( | const std::string & | pkgname, |
Xapian::valueno | val_id | ||
) | const |
Get the integer value stored in value slot val_id for the given package.
References db(), and docidByName().
bool ept::textsearch::TextSearch::hasData | ( | ) | const [inline] |
Xapian::Query ept::textsearch::TextSearch::makeORQuery | ( | const std::string & | keywords | ) | const |
Tokenize the string and build an OR query with the resulting keywords.
Referenced by makePartialORQuery().
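The tokenizing step can be illustrated with a small standard-library sketch. It is not libept's implementation: the names tokenizeKeywords and describeOrQuery are hypothetical, the splitting rule (break on any non-alphanumeric character) is an assumption, and the stemming that the class appears to apply through m_stem is left out. The OR query is shown as a plain string rather than a Xapian::Query.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split on anything that is not a letter or digit
// and lowercase each token. Stemming is deliberately omitted.
std::vector<std::string> tokenizeKeywords(const std::string& keywords) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char c : keywords) {
        if (std::isalnum(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// Hypothetical helper: render the tokens as a readable OR expression.
std::string describeOrQuery(const std::vector<std::string>& tokens) {
    std::string out;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        if (i > 0) out += " OR ";
        out += tokens[i];
    }
    return out;
}
```

With this sketch, the input "Image Editor" yields the tokens image and editor, which any document containing either term would match.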
Xapian::Query ept::textsearch::TextSearch::makeORQuery | ( | const ITER & | begin, |
const ITER & | end | ||
) | const [inline] |
Xapian::Query ept::textsearch::TextSearch::makePartialORQuery | ( | const std::string & | keywords | ) | const |
Tokenize the string and build an OR query with the resulting keywords.
The last token in keywords is considered to be typed only partially, to implement proper search-as-you-type.
References m_db, and makeORQuery().
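One plausible way to handle the partially typed last token is to expand it into every indexed term that begins with it, and OR those terms with the complete tokens. The sketch below models that idea with a sorted vector standing in for the index's term list. The functions expandPrefix and partialOrTerms are hypothetical names, and whether libept expands prefixes exactly this way is an assumption.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: return every term that starts with `prefix`.
// `sortedTerms` stands in for the index's term list and must be sorted.
std::vector<std::string> expandPrefix(const std::vector<std::string>& sortedTerms,
                                      const std::string& prefix) {
    std::vector<std::string> matches;
    auto it = std::lower_bound(sortedTerms.begin(), sortedTerms.end(), prefix);
    for (; it != sortedTerms.end() && it->compare(0, prefix.size(), prefix) == 0; ++it)
        matches.push_back(*it);
    return matches;
}

// Hypothetical helper: keep complete tokens as they are and replace the
// last token with all of its prefix expansions.
std::vector<std::string> partialOrTerms(const std::vector<std::string>& tokens,
                                        const std::vector<std::string>& sortedTerms) {
    if (tokens.empty()) return {};
    std::vector<std::string> terms(tokens.begin(), tokens.end() - 1);
    for (const auto& t : expandPrefix(sortedTerms, tokens.back()))
        terms.push_back(t);
    return terms;
}
```

Looking up the prefix with a binary search keeps each keystroke cheap, which matters for search-as-you-type because the query is rebuilt on every change to the input.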
Xapian::Query ept::textsearch::TextSearch::makeRelatedQuery | ( | const std::string & | pkgname | ) | const |
Create a query to look for packages similar to the given one.
References db().
bool ept::textsearch::TextSearch::needsRebuild | ( | apt::Apt & | apt | ) |
Returns true if the index is older than the Apt database information.
References m_timestamp, and ept::apt::Apt::timestamp().
Referenced by TestTextsearch::empty().
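The staleness rule can be sketched as a comparison of two timestamps. The struct IndexState below is a hypothetical stand-in for the state kept in m_timestamp. Treating a zero timestamp as "no index" and using a strict greater-than comparison are assumptions, not confirmed details of libept.

```cpp
#include <ctime>

// Hypothetical stand-in for the timestamp state held by TextSearch.
struct IndexState {
    std::time_t timestamp = 0;  // assumption: 0 means no index was found

    // Assumed rule: the index has data once it carries a nonzero timestamp.
    bool hasData() const { return timestamp > 0; }

    // Assumed rule: rebuild when the Apt data is newer than the index.
    bool needsRebuild(std::time_t aptTimestamp) const {
        return aptTimestamp > timestamp;
    }
};
```

Under these assumptions a missing index always reports that it needs a rebuild, while an index newer than the Apt data is left alone.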
void ept::textsearch::TextSearch::normalize_and_add | ( | Xapian::Document & | doc, |
const std::string & | term, | ||
int & | pos | ||
) | const [protected] |
bool ept::textsearch::TextSearch::rebuildIfNeeded | ( | apt::Apt & | apt, |
const std::vector< const ExtraIndexer * > & | extraIndexers = std::vector<const ExtraIndexer*>() |
||
) |
Rebuild the index if needed.
Allows specifying functors that contribute to the indexing.
References ept::apt::PackageRecord::description(), ept::textsearch::Path::index(), ept::apt::PackageRecord::installedSize(), m_db, m_stem, m_timestamp, ept::textsearch::max_index, ept::apt::PackageRecord::package(), ept::apt::PackageRecord::packageSize(), ept::apt::Apt::recordBegin(), ept::apt::Apt::recordEnd(), ept::apt::RecordParser::scan(), ept::textsearch::Path::setTimestamp(), ept::apt::PackageRecord::tag(), ept::apt::Apt::timestamp(), ept::textsearch::VAL_APT_INSTALLED_SIZE, and ept::textsearch::VAL_APT_PACKAGE_SIZE.
time_t ept::textsearch::TextSearch::timestamp | ( | ) | const [inline] |
Timestamp of when the Xapian database was last updated.
References m_timestamp.
Referenced by TestTextsearch::empty().
std::string ept::textsearch::TextSearch::toLower | ( | const std::string & | str | ) | [static, protected] |
Return a lowercased copy of the string.
Referenced by makeORQuery(), and normalize_and_add().
Xapian::Database ept::textsearch::TextSearch::m_db [protected] |
Referenced by db(), docidByName(), makePartialORQuery(), rebuildIfNeeded(), and TextSearch().
Xapian::Stem ept::textsearch::TextSearch::m_stem [protected] |
Referenced by makeORQuery(), normalize_and_add(), and rebuildIfNeeded().
time_t ept::textsearch::TextSearch::m_timestamp [protected] |
Referenced by hasData(), needsRebuild(), rebuildIfNeeded(), TextSearch(), and timestamp().