### Run GQFast

**1. Download GQFast from the [download page](download.html)**

GQFast has the following directory structure:

```
Data: all the data files should go here. E.g., DT.csv
Configuration: all the configuration files go here. E.g., config_dt_doc.gqfast
Index: the indices created by GQFast go here. E.g., index_dt_doc.gqfast
MetaData: the meta-data files created by GQFast go here. E.g., meta_dt_doc.gqfast
Query: all the queries should go here. E.g., SD.query
Code: all the generated code goes here.
Result: all the answers of queries go here. E.g., SD.result
```

Note that users should put the data files in the *Data* directory, the configuration files in the *Configuration* directory, and the queries in the *Query* directory.

**2. Run the three GQFast scripts**

GQFast provides the following three scripts:

* gqfast_init: takes no input parameters. It checks the environment and initializes settings.

  ```
  ./gqfast_init
  ```

* gqfast_load: takes two input parameters, the data file name and the configuration file name.

  ```
  ./gqfast_load data-file-name.csv configuration-file-name.xml
  ```

* gqfast_execute: takes one input parameter, the query name.

  ```
  ./gqfast_execute query-name
  ```

<hr width="100%">

### Environment

GQFast runs on Linux systems. It has been fully tested on Ubuntu 14.04. It also requires the following two dependencies.

- Boost C++ Libraries.

  ```
  sudo apt-get install libboost-all-dev
  ```

- Java 1.8+

  ```
  sudo add-apt-repository ppa:webupd8team/java
  sudo apt-get update
  sudo apt-get install oracle-java8-installer
  ```

<hr width="100%">

### Build indices via gqfast_load

As shown in the following figure, users submit a data file and a configuration file to the **GQ-Fast Loader**. The **GQ-Fast Loader** then creates a GQ-Fast index and its associated meta-data.

<img src="../img/loader.png" width="700" height="300"><br>

The input data file is a CSV file and the configuration file is an XML file.
An example data file is as follows:

```
Doc,Term,Fre
1,35,9
1,920,10
1,931,13
1,2796,12
1,4490,12
1,4970,10
1,24945,15
1,187879,10
2,1,3
2,535,8
2,4970,12
......
```

An example config.xml file is shown below:

```
<?xml version="1.0" encoding="UTF-8"?>
<!--
    The default encoding method is uncompressed array, which users do not need to specify.
    GQ-Fast currently also supports the following encoding methods:
    - BCA: bit-aligned compressed array
    - BB: byte-aligned compressed bitmap
    - HU: Huffman encoding
-->
<configuration>
    <table_name>DT</table_name>
    <index_column_id>0</index_column_id>
    <encodings>
        <column_id_type>1,BCA</column_id_type>
        <column_id_type>2,BB</column_id_type>
    </encodings>
</configuration>
```

The output index and meta-data are serialized and stored on disk. <br>
The index format can be found in <a href="https://arxiv.org/pdf/1602.00033v3.pdf" target="_blank">our paper</a> (Section 5). <br>
The meta-data contains the following information.<br>
<ul>
<li> For each column, GQ-Fast records its domain size, its minimal value, and the minimal number of bytes needed to store it.
<li> For the whole table, GQ-Fast records the number of encoded columns, the maximal fragment size, and the table type.
</ul>

The format of the meta-data is shown below:

```
line 1: table name
line 2: lookup column name
line 3: num encodings (total cols - 1)
line 4: column names
line 5: domains
line 6: column min values
line 7: column byte sizes
line 8: column encoding flags
line 9: max fragment size
line 10: flag of table type
```

The encoding flags are defined below:

```
1: UA (Uncompressed Array, a default setting)
2: BCA (Bit-aligned Compressed Array)
3: BB (Byte-aligned Bitmap)
4: HU (Huffman)
```

An example meta-data file is shown below:

```
dt
doc
2
doc,term,fre
5001,751,255
0,1,1
4,4,1
0,2,4
133
0
```

<hr width="100%">

### Run query via gqfast_execute

gqfast_execute performs three steps:

1. parse the input query to generate C++ code
2. compile the generated C++ code
3. run the C++ code on the indices to get answers

<!--
####Parse query

Execute the following command to parse a query

```
javac gqfast_parser.java
java gqfast_parser -q query_filename -m meta_filename -n query_name
```

The input of *GQ-Fast Parser* is a query file, a query name, and a meta-data file, which is one of the outputs of the loader. The output of *GQ-Fast Parser* is a generated code and an executor_settings file, which includes a mapping from table names to index names.

The format of the executor_settings file is shown below:

```
line 1: total number of indices related to the query, say it is n.
line 2: domain size of aggregation column
line 3 - line (n+2): index name
```

An example executor_settings file is shown below:

```
3
751
dt_doc
dt_term
da_doc
```

<hr width="100%">

### GQ-Fast Executor

The **GQ-Fast Executor** has three steps: (1) compile generated code, (2) compile executor, and (3) run executor

**compile generated code**

```
g++ -std=c++11 -fPIC -c generatedcode_filename.cpp
g++ -shared -o queryname.so generatedcode_filename.o
```

**compile executor**

```
g++ -std=c++11 gqfast_executor.cpp -ldl -pthread
g++ -o gqfast_run_query gqfast_executor.o -ldl
```

**run executor**

```
./gqfast_run_query -s executor_settings_filename -q queryname.so
```

Results are sorted in descending order. Results are written to a file query_name.result in the *GQFast/Result/* directory.
-->
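
When checking what gqfast_load produced, it can help to read the ten-line meta-data file back into named fields. The following Python sketch is not part of GQFast: `TableMeta`, `parse_meta`, and `SAMPLE` are hypothetical names chosen for illustration. It assumes that every per-column line holds one comma-separated value per column name, as in the example meta-data above. The example also contains encoding flag 0, which the list of encoding flags does not define, so the sketch reports it as `unknown` instead of guessing.

```python
from dataclasses import dataclass
from typing import List

# Encoding flags as listed in the meta-data description.
# Any other value (such as 0 in the example file) is reported as "unknown".
ENCODING_NAMES = {1: "UA", 2: "BCA", 3: "BB", 4: "HU"}


@dataclass
class TableMeta:
    """Hypothetical container for the ten meta-data lines."""
    table_name: str
    lookup_column: str
    num_encodings: int
    column_names: List[str]
    domains: List[int]
    min_values: List[int]
    byte_sizes: List[int]
    encoding_flags: List[int]
    max_fragment_size: int
    table_type_flag: int

    def encoding_name(self, column: str) -> str:
        """Return the encoding abbreviation for a column, or "unknown"."""
        flag = self.encoding_flags[self.column_names.index(column)]
        return ENCODING_NAMES.get(flag, "unknown")


def _ints(line: str) -> List[int]:
    """Split a comma-separated line into integers."""
    return [int(x) for x in line.split(",")]


def parse_meta(text: str) -> TableMeta:
    """Parse meta-data text laid out as the ten lines described above."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) != 10:
        raise ValueError(f"expected 10 non-empty lines, got {len(lines)}")
    meta = TableMeta(
        table_name=lines[0],
        lookup_column=lines[1],
        num_encodings=int(lines[2]),
        column_names=lines[3].split(","),
        domains=_ints(lines[4]),
        min_values=_ints(lines[5]),
        byte_sizes=_ints(lines[6]),
        encoding_flags=_ints(lines[7]),
        max_fragment_size=int(lines[8]),
        table_type_flag=int(lines[9]),
    )
    # Assumption: each per-column line has one entry per column name.
    per_column = (meta.domains, meta.min_values,
                  meta.byte_sizes, meta.encoding_flags)
    if any(len(values) != len(meta.column_names) for values in per_column):
        raise ValueError("per-column lines do not match the column count")
    return meta


# The example meta-data from the loader section, one value per line.
SAMPLE = """dt
doc
2
doc,term,fre
5001,751,255
0,1,1
4,4,1
0,2,4
133
0
"""

if __name__ == "__main__":
    meta = parse_meta(SAMPLE)
    for name, domain, size in zip(meta.column_names, meta.domains,
                                  meta.byte_sizes):
        print(name, domain, size, meta.encoding_name(name))
```

Run directly, the script prints one line per column with its domain size, byte size, and encoding abbreviation. To inspect a real file, pass the contents of a file from the *MetaData* directory to `parse_meta` instead of `SAMPLE`.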