
editsql's Introduction

EditSQL for Spider, SParC, CoSQL

This is a PyTorch implementation of the CD-Seq2Seq baseline and the EditSQL model from the following papers.

Please cite the papers if you use our data and code.

@InProceedings{yu2018spider,
  author =      "Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, Zilin Zhang, Dragomir Radev",
  title =       "Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task",
  booktitle =   "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
  year =        "2018",
  address =     "Brussels, Belgium"
}

@InProceedings{yu2019sparc,
  author =      "Tao Yu, Rui Zhang, Michihiro Yasunaga, Yi Chern Tan, Xi Victoria Lin, Suyi Li, Heyang Er, Irene Li, Bo Pang, Tao Chen, Emily Ji, Shreya Dixit, David Proctor, Sungrok Shim, Jonathan Kraft, Vincent Zhang, Caiming Xiong, Richard Socher, Dragomir Radev",
  title =       "SParC: Cross-Domain Semantic Parsing in Context",
  booktitle =   "Proceedings of The 57th Annual Meeting of the Association for Computational Linguistics",
  year =        "2019",
  address =     "Florence, Italy"
}

@InProceedings{yu2019cosql,
  author =      "Tao Yu, Rui Zhang, He Yang Er, Suyi Li, Eric Xue, Bo Pang, Xi Victoria Lin, Yi Chern Tan, Tianze Shi, Zihan Li, Youxuan Jiang, Michihiro Yasunaga, Sungrok Shim, Tao Chen, Alexander Fabbri, Zifan Li, Luyao Chen, Yuwen Zhang, Shreya Dixit, Vincent Zhang, Caiming Xiong, Richard Socher, Walter Lasecki, Dragomir Radev",
  title =       "CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases",
  booktitle =   "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
  year =        "2019",
  address =     "Hong Kong, China"
}

@InProceedings{zhang2019editing,
  author =      "Rui Zhang, Tao Yu, He Yang Er, Sungrok Shim, Eric Xue, Xi Victoria Lin, Tianze Shi, Caiming Xiong, Richard Socher, Dragomir Radev",
  title =       "Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions",
  booktitle =   "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
  year =        "2019",
  address =     "Hong Kong, China"
}

Contact Rui Zhang for any questions.

Dependencies

The model is tested with Python 3.6 and PyTorch 1.0. We recommend using conda and pip:

conda create -n editsql python=3.6
source activate editsql
pip install -r requirements.txt
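
A quick, optional sanity check that the environment matches what the code expects (this snippet is our addition, not part of the repository; run it inside the editsql conda environment):

import sys
import torch

print(sys.version)                # expect 3.6.x
print(torch.__version__)          # expect 1.0.x
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is visible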

Download the pretrained BERT model from here and save it as model/bert/data/annotated_wikisql_and_PyTorch_bert_param/pytorch_model_uncased_L-12_H-768_A-12.bin.
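
To sanity-check the download, a minimal snippet (our illustration; it assumes the file is a standard PyTorch state dict, as BERT pytorch_model.bin checkpoints normally are):

import torch

bert_path = ("model/bert/data/annotated_wikisql_and_PyTorch_bert_param/"
             "pytorch_model_uncased_L-12_H-768_A-12.bin")
state_dict = torch.load(bert_path, map_location="cpu")  # dict of parameter tensors
print(len(state_dict), "parameter tensors loaded")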

Download the database SQLite files from here and put them under data/database.

Run Spider experiment

First, download Spider. Then please follow:

  • run_spider_editsql.sh. We saved our experimental logs at logs/logs_spider_editsql. The dev results can be reproduced by test_spider_editsql.sh with the trained model downloaded from here and placed under logs/logs_spider_editsql/save_12.

This reproduces the Spider result in "Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions".

            Dev    Test
EditSQL     57.6   53.4

Run SParC experiment

First, download SParC. Then please follow:

  • Use CD-Seq2Seq: run_sparc_cdseq2seq.sh. We saved our experimental logs at logs/logs_sparc_cdseq2seq.
  • Use CD-Seq2Seq with segment copy: run_sparc_cdseq2seq_segment_copy.sh. We saved our experimental logs at logs/logs_sparc_cdseq2seq_segment_copy.
  • Use EditSQL: run_sparc_editsql.sh. We saved our experimental logs at logs/logs_sparc_editsql. The dev results can be reproduced by test_sparc_editsql.sh with the trained model downloaded from here and placed under logs/logs_sparc_editsql/save_31_sparc_editsql.

This reproduces the SParC result in "Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions".

                                                Question Match    Interaction Match
                                                Dev     Test      Dev      Test
CD-Seq2Seq                                      21.9    -         8.1      -
CD-Seq2Seq+segment copy (use predicted query)   21.7    -         9.5      -
CD-Seq2Seq+segment copy (use gold query)        27.3    -         10.0     -
EditSQL (use predicted query)                   47.2    47.9      29.5     25.3
EditSQL (use gold query)                        53.4    54.5      29.2     25.0

Run CoSQL experiment

First, download CoSQL. Then please follow:

  • run_cosql_cdseq2seq.sh. We saved our experimental logs at logs/logs_cosql_cdseq2seq.
  • run_cosql_editsql.sh. We saved our experimental logs at logs/logs_cosql_editsql. The dev results can be reproduced by test_cosql_editsql.sh with the trained model downloaded from here and placed under logs/logs_cosql_editsql/save_12_cosql_editsql.

This reproduces the SQL-grounded dialog state tracking result in "CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases".

             Question Match    Interaction Match
             Dev     Test      Dev      Test
CD-Seq2Seq   13.8    13.9      2.1      2.6
EditSQL      39.9    40.8      12.3     13.7

Run ATIS experiment

To obtain the ATIS data and evaluate execution accuracy, download the ATIS data from here, set up your MySQL database for ATIS, and change --database_username and --database_password in parse_args.py (a sketch follows).
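
The entries in parse_args.py should look roughly like this sketch (the flag names come from the paragraph above; the defaults shown are placeholders to replace with your own MySQL credentials, and the surrounding code is our illustration, not the repository's exact code):

import argparse

parser = argparse.ArgumentParser()
# Edit the defaults (or pass the flags on the command line) for your MySQL setup.
parser.add_argument('--database_username', type=str, default="your_mysql_username")
parser.add_argument('--database_password', type=str, default="your_mysql_password")
args = parser.parse_args()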

Then please follow run_atis.sh.

This reproduces the ATIS result in "Learning to map context dependent sentences to executable formal queries". We saved our experimental logs at logs/logs_atis.

                    Dev                               Test
                    Query      Relaxed    Strict      Query      Relaxed    Strict
Suhr et al., 2018   37.5(0.9)  63.0(0.7)  62.5(0.9)   43.6(1.0)  69.3(0.8)  69.2(0.8)
Our Replication     38.8       63.3       62.8        44.6       68.3       68.2

Note

  • Some interactions are long enough to cause OOM issues on the GPU. Depending on your hardware, you can skip those interactions like this; a sketch of the idea appears after this list.

  • EditSQL does not predict values in SQL. To generate values, the model would need to (1) copy from the question inputs, (2) retrieve from the database content (the database content is available), or (3) generate numbers (e.g., 1 or 3 after LIMIT in some SQL queries); a toy post-processing sketch follows this list as well. More information is available at Spider.
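
The first note can be implemented as a small pre-filter. A minimal sketch, assuming a list of interaction objects with an utterances attribute (the function name, attribute name, and threshold are all hypothetical; adapt them to the codebase's actual data structures):

# Hypothetical pre-filter: drop interactions with too many turns before
# training, to avoid CUDA out-of-memory errors on smaller GPUs.
def filter_long_interactions(interactions, max_turns=25):
    # `utterances` is an assumed attribute name; adapt it to the real loader.
    return [i for i in interactions if len(i.utterances) <= max_turns]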
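
For the second note, a toy post-processing sketch of strategies (1) and (3): copy a number mentioned in the question into the LIMIT clause of the predicted SQL. This is an illustration only, not part of the released code:

import re

def fill_limit(question, predicted_sql):
    # Replace the placeholder "LIMIT 1" with the last number in the question, if any.
    numbers = re.findall(r"\d+", question)
    if numbers and "LIMIT 1" in predicted_sql:
        return predicted_sql.replace("LIMIT 1", "LIMIT " + numbers[-1])
    return predicted_sql

print(fill_limit("Get the top 10 samples in column_x",
                 "SELECT column_x FROM table ORDER BY column_x DESC LIMIT 1"))
# -> SELECT column_x FROM table ORDER BY column_x DESC LIMIT 10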

Acknowledgement

This implementation is based on "Learning to map context dependent sentences to executable formal queries" by Alane Suhr, Srinivasan Iyer, and Yoav Artzi, NAACL 2018.


editsql's Issues

Beam search decoding?

Hi, I was wondering if you support beam search decoding. I tried to check the code for that but could not find any relevant parts. If you support it, would you please point me to the relevant part in the code?
Thanks,

Train and dev pickle files not found

I am trying to run this model using the test_spider_editsql.sh command, and I am getting the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'processed_data_spider_removefrom/train.pkl'

The test script is looking for pickle files that are not present in the repo. Could you please look into this?

Batch training for spider dataset

Hi,

I notice that in the training bash script for Spider, interaction_level is set to true.
Should I run Spider (which actually has no interactions) with this setting?
The current setting is:
a) batch_size is set to 1 in run.py (L48)
b) interaction_level is true

Is batch training available at the interaction level using schema_interaction_model?

Thanks in advance!

Error on running `run_sparc_editsql.sh`

Hi, I am getting this error when I try to run run_sparc_editsql.sh:

FileNotFoundError: [Errno 2] No such file or directory: 'logs_sparc_editsql/valid_use_predicted_queries_predictions.json'

There is also a warning prior to that:
Warning: arguments already exist in logs_sparc_editsql/args.log

I am following your steps in Colab. Everything is installed and placed in the mentioned folders.

How to remove OOV and infer text directly to SQL

While running the run_sparc_editsql.sh script on my added data, I am facing the following issue:

read_data_json data/sparc/train_no_value.json 3034
read_data_json data/sparc/dev_no_value.json 423
continue
OOV! Doc_name
continue

and I am not able to infer text directly to SQL.
Any suggestions? Thanks.

About word_emb file

I'm trying to run run_cosql_cdseq2seq.sh, but I found no glove.840B.300d.txt file in this git project. How can I get this file?

How to run editsql on CoSQL?

I ran run_cosql_cdseq2seq.sh following the README, but I can't get a log file like logs/logs_cosql_editsql/log.txt. What should I change?

What is the removefrom?

Excuse me, I find that during model training, the remove-from corpus is used.
But I wonder whether it loses corpus information and incurs errors.
For example, the first utterance of interaction 35 is:
"utterance": "how many dorms have a TV Lounge ?",
and the original SQL is
"select count ( * ) from dorm as t1 join has_amenity as t2 on t1 . dormid = t2 . dormid join dorm_amenity as t3 on t2 . amenid = t3 . amenid where t3 . amenity_name = value"
After removefrom, it becomes:
"select count ( * ) where dorm_amenity.amenity_name = value"

It seems that the removefrom sequence loses much information and is not equal to the original SQL sequence.

So my question is whether the original SQL and the removefrom SQL are equivalent (I mean, whether they can be converted one-to-one). And has anyone (paper/article) done this in the same way before?

Problems with run_sparc_editsql.sh and test_sparc_editsql.sh

Hi,

I am trying to run run_sparc_editsql.sh and test_sparc_editsql.sh. I have already installed all the requirements as suggested by the GitHub page and by the following page: https://towardsdatascience.com/natural-language-to-sql-use-it-on-your-own-database-d4cd5784d081. I am using Colab in order to work with a GPU.
I downloaded the glove.840B.300d.txt file and changed the path in run_sparc_editsql.sh and in test_sparc_editsql.sh.

When I run bash run_sparc_editsql.sh, I initially get the following error: No such file or directory: 'logs_sparc_editsql/valid_use_predicted_queries_predictions.json', so I copied the given file, which is present in logs/logs_sparc_editsql, into the generated logs_sparc_editsql folder. When I then run run_sparc_editsql.sh again, I get the following message:

File "/content/gdrive/My path/editsql/model/model.py", line 91, in
embedding = np.array([float(val) for val in l_split[-embedding_size:]])
ValueError: could not convert string to float: 'squeeze2'

Moreover, if I then try to run test_sparc_editsql.sh, I get the following error:

RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /pytorch/aten/src/THC/THCGeneral.cpp:51
(But I checked, and CUDA seems to be available.)

What can I do to solve these problems and successfully run the code?

Thank you in advance.

Testing on custom questions/queries

I was able to train/test on the given Spider and "spider_data_removefrom" sets, but I want to test the model on a given question-query pair. I tried editing the queries in the dev.json file, but the predictions remain the same as for the queries in the original dev.json. Can you help me out?

[Question] Did you set the SOTA on Spider?

Given the name, I was wondering whether the current SOTA on Spider (RYANSQL + BERT, 12 November) is your work? And if so, is it based on the EditSQL model?

Many thanks for satisfying my curiosity :-)

How do I make inferences on custom queries?

Right now, when I try to put just the question text, without the "sql", "query", "query_toks" & "query_toks_no_value" keys, into dev.json, I get an error when running run_spider.sh and test_spider.sh.

I want to understand how I can make predictions (in an inference environment) on my sample queries; the dummy entry shape I am experimenting with is sketched below.

Could someone please help me out here?
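
For reference, the dummy entry looks roughly like this (the key names are the ones from the error above plus the usual Spider fields; the query values are placeholders just to keep the loader happy, and the parsed "sql" structure is left empty, which may itself be the problem):

# Hypothetical placeholder entry for dev.json, expressed as a Python dict.
entry = {
    "db_id": "concert_singer",                  # target database
    "question": "How many singers are there?",  # my custom question
    "question_toks": ["How", "many", "singers", "are", "there", "?"],
    "query": "SELECT count(*) FROM singer",     # placeholder gold query
    "query_toks": ["SELECT", "count", "(", "*", ")", "FROM", "singer"],
    "query_toks_no_value": ["select", "count", "(", "*", ")", "from", "singer"],
    "sql": {},                                  # normally the parsed SQL tree; left empty here
}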

Some questions about repeating the results on SParC.

I want to reproduce the SParC result on my server; I only downloaded the code and ran run.sh following the README. The model 'saved_42' was chosen to get the final result.
But the final result is only 46.0, which is lower than the result in the paper. Although I have repeated it many times, I get similar results like 45.8 or 45.5.
Maybe the experimental environments are different? Or did you use some different parameters, like the learning rate or something?
I hope you can give me some suggestions.

verify results

Hi,

Thanks for the code.

I ran the test_sparc_editsql.sh script, and now I see
valid gold-passing STRING_ACCURACY: 53.78
in log.txt.
Is this the same as EditSQL (use gold query) | 53.4?
Also, can you please tell me where the interaction match accuracy is calculated?

Thanks a lot

Retraining on the Spider dataset doesn't reproduce results: `valid gold-passing STRING_ACCURACY = 0.00`.

I am retraining the model on the Spider dataset using run_spider_editsql.sh.
These are the logs created:

Original number of training utterances:	8642
Actual number of used training examples:	8642
Number of steps per epoch:	8642
Batch size:	1
Epoch:	0
train epoch loss:	0.044776277615753715
train final gold-passing LOSS:	29.60
train final gold-passing TOKEN_ACCURACY:	92.11
train final gold-passing STRING_ACCURACY:	57.00
valid gold-passing LOSS:	322.41
valid gold-passing TOKEN_ACCURACY:	50.00
valid gold-passing STRING_ACCURACY:	0.00
countdown:	9

Epoch:	1
train epoch loss:	0.01979759320021786
train final gold-passing LOSS:	19.82
train final gold-passing TOKEN_ACCURACY:	94.59
train final gold-passing STRING_ACCURACY:	64.00
valid gold-passing LOSS:	534.82
valid gold-passing TOKEN_ACCURACY:	50.00
valid gold-passing STRING_ACCURACY:	0.00
learning rate coefficient:	0.8
countdown:	8

Epoch:	2
train epoch loss:	0.012703387576976938
train final gold-passing LOSS:	13.85
train final gold-passing TOKEN_ACCURACY:	96.08
train final gold-passing STRING_ACCURACY:	74.00
valid gold-passing LOSS:	363.45
valid gold-passing TOKEN_ACCURACY:	50.00
valid gold-passing STRING_ACCURACY:	0.00
countdown:	7
........

It ran for 10 epochs, but valid gold-passing TOKEN_ACCURACY and valid gold-passing STRING_ACCURACY remained constant at 50.00 and 0.00, respectively.

getting actual values in predicted SQL queries

Hello,
I find that all the values in the input questions are replaced by 1 in the SQL query.

For example, if my input is (these examples are just illustrative, not real):

"Get the top 10 samples in column_x"
This top 10 query is translated to "SELECT column_x FROM table ORDER BY column_x DESC LIMIT 1"
So it replaced top 10 by top 1

Also in conditions:
"List a column X where column Y < 0.1"
is translated to
"SELECT column_x FROM table WHERE column_y < 1".
Again, 0.1 is replaced by 1.

Do you have a mechanism to fix this?

About the GPU resource

Hello! I'm now trying to run this code on the CoSQL dataset but have met some problems.
I use the default parameters and my 16 GB GPU runs out of memory. I have tried some adjustments, but they didn't work. For example, batch_size is set to 16 by default, and I still run out of memory even when I change it to 1. (After reading the code briefly, I found that this parameter is overwritten by interaction_level, so the effective batch size is always 1 and changing it has no effect.)
Could you please share anything you know, like how to reduce the GPU memory requirements, or point out whether I have made some mistakes?

I was wondering how to repeat your experiment

Hi,
I was wondering whether you could provide guidance on repeating your experiment.
I cannot generate the final output to evaluate the results.
It seems the eval_step function in the model has been deleted.
So how can we generate the final SQL output by inputting dev.json?
I can only get 23.60% valid gold-passing STRING_ACCURACY without BERT.
Maybe I have made some mistakes, or it may improve when evaluated with the Spider exact match evaluation method.
So I was wondering what I should do to get the same results as in your paper.
Thank you.

Confused about the Result

Thank you for open-sourcing your baselines and the awesome dataset :) I have a question about the performance of CD-Seq2Seq. The paper and the leaderboard report one result, but in the README.md file I find it higher (almost a 5% improvement on the dev question match result) than the reported result.

So I want to know which one is correct. Or does the latest result in the README follow the setting in the paper? Thanks for your reply.

Licensing

What type of license do you have for this code? Is it Apache 2.0?
