
ottertune's People

Contributors

apavlo, arifiorino, bohanjason, dependabot[bot], dvanaken, yangdsh


ottertune's Issues

ConversionUtil

ConversionUtil.get_human_readable essentially passes the value we're converting, along with a system (a list of tuples pairing each suffix with its numeric factor), to hurry.filesize's size function. The issue is in the conversion to the POSTGRES_TIME_SYSTEM in PostgresUtilImpl.

The POSTGRES_TIME_SYSTEM is currently in alphabetical order of the abbreviations - ['d', 'hr', 'min', 'ms', 's']. This poses an issue for the size function since it converts the numeric value to the first unit whose factor is less than the value passed in.
e.g., 55000 converted to a time becomes "55000ms" rather than "55s".

The quick fix is to reorder POSTGRES_TIME_SYSTEM into ['d', 'hr', 'min', 's', 'ms']. However, this causes another issue: the current implementation of ConversionUtil.get_raw_size tries to parse '1000ms' as '1000m' seconds and throws a ValueError.

This can be fixed with a try-except in get_raw_size in addition to the reordering.
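A minimal sketch of the reordering plus the defensive parse (the millisecond factors and the get_raw_size signature below are assumptions for illustration, not the actual OtterTune code):

# Hypothetical sketch: units ordered largest-to-smallest, factors in milliseconds.
POSTGRES_TIME_SYSTEM = [
    (1000 * 60 * 60 * 24, 'd'),
    (1000 * 60 * 60, 'hr'),
    (1000 * 60, 'min'),
    (1000, 's'),
    (1, 'ms'),
]

def get_raw_size(value, system=POSTGRES_TIME_SYSTEM):
    """Convert a human-readable string like '55s' back to a raw number."""
    for factor, suffix in system:
        if not value.endswith(suffix):
            continue
        try:
            return float(value[:-len(suffix)]) * factor
        except ValueError:
            # e.g. '1000ms' first matches the 's' suffix and leaves '1000m';
            # keep trying the remaining (smaller) suffixes instead of failing.
            continue
    raise ValueError('Invalid value: {}'.format(value))

With this ordering, '1000ms' falls through the 's' match and is parsed correctly by the 'ms' entry.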

deploy?

Hi, can you provide a detailed deployment document?
It would be best to walk through a deployment with an example database, explain it as much as you can, and tell us how to use it.
Thank you!

Max Value

We need to set max values for some knobs on different hardware. For example, the buffer memory cannot exceed the total memory of the machine.

We can get the hardware information (e.g. memory, cpu, storage info) from here.

class Hardware(BaseModel):

We select a hardware type when creating a session. Then we can get the hardware info, like memory.
When tuning memory-related knobs, we do not want to exceed the hardware memory. X_max values can be set here:

model.fit(X_scaled, y_scaled, X_min, X_max, ridge)

The first step is to add KnobResourceType in the types file ottertune/server/website/website/types.py
1: Memory, 2: CPU, 3: Storage, 4: Other
Then add the resource attribute in the KnobCatalog model

class KnobCatalog(BaseModel):
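A hypothetical sketch of both steps (this is not the exact OtterTune source; the constant names, field name, and default are assumptions):

# in website/types.py
class KnobResourceType(object):
    MEMORY = 1
    CPU = 2
    STORAGE = 3
    OTHER = 4

# in website/models.py (assumes django.db.models is already imported there)
class KnobCatalog(BaseModel):
    # ... existing fields ...
    resource = models.IntegerField(default=KnobResourceType.OTHER)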

I have selected some memory-related knobs for Postgres; you can add the info in the knob fixture ottertune/server/website/website/fixtures/postgres-96_knobs.json and reload the data.

For now we only support memory. The following knobs should have their resource type set to 1 (i.e., Memory); the others can be set to 4 (Other) in the knob fixture.

global.work_mem
global.shared_buffers
global.temp_buffers
global.wal_buffers
global.maintenance_work_mem
global.effective_cache_size

Fix target objective

Only show the selected target objective instead of all target objectives in the session view. E.g., when the user selects throughput as the target objective, latency should not be shown in the session view; otherwise the page breaks because it has no latency data.

Encode categorical variables

When tuning knobs that are categorical variables (enums), OtterTune currently just replaces the enum with its index in the list of all possible enums. We need to add support to correctly handle categorical variables before running the machine learning algorithms by using an encoding scheme (e.g., dummy encoding, one-hot encoding).
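A minimal sketch of one-hot (dummy) encoding for an enum knob, using only numpy. The helper name and column layout are illustrative, not OtterTune's actual schema; the example categories are Postgres wal_sync_method values:

import numpy as np

def one_hot_encode(values, categories):
    """Map each categorical value to a 0/1 indicator vector (one column per category)."""
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)))
    for row, value in enumerate(values):
        encoded[row, index[value]] = 1.0
    return encoded

# e.g. one_hot_encode(['fsync', 'open_sync'], ['fsync', 'fdatasync', 'open_sync'])
# -> [[1., 0., 0.], [0., 0., 1.]]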

Support other Objective Functions

Now we hardcode the objective function as throughput. See:

if self.transactions_counter not in metric_data:

and

def transactions_counter(self):

Also, we use gradient descent to minimize -throughput.

# FIXME: Maximize the throughput, hardcode

We want to remove the hardcoded part and support other objective functions like latency (less is better), using gradient descent to minimize these less-is-better objectives directly. Users can choose the objective function from a list (for now, throughput and latency are enough). You may want to change the above files. The following files may also need to be changed.

class MetricManager(models.Manager):

<td><div class="text-right">{{ labels.target_objective }}</div></td>
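One hedged way to remove the hardcoded sign flip is to keep a small table of which objectives are less-is-better; the set name and function below are assumptions, not the project's actual API:

LESS_IS_BETTER_OBJECTIVES = {'latency'}

def to_minimization_target(objective_name, value):
    """Gradient descent always minimizes, so flip the sign only for more-is-better metrics."""
    if objective_name in LESS_IS_BETTER_OBJECTIVES:
        return value   # e.g. latency: minimize directly
    return -value      # e.g. throughput: minimize the negative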

Some files are missing from the code

Hi, I read your code and found some files missing, for example, analysis.preprocessing.

There exists:

from .util import NEARZERO
from common.typeutil import is_numeric_matrix, is_lexical_matrix

but I can't find the files util and typeutil.

LabelUtil outputs have inconsistent encodings

Using LabelUtil's style_util method with LabelStyle.CAPFIRST outputs labels in Unicode. Other LabelStyles output standard Python 2.x strings.

To be consistent, all label styles should output Unicode.

speed up the website

The website is too slow when many results are uploaded. When uploading ~1000 results, the page response time is ~20 seconds. We need to speed up the website by optimizing the Django code (e.g., using Django's cache). From my previous experience, the response time can be reduced to less than 1 second.
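As one illustrative option (the view name is hypothetical, not OtterTune's actual code), Django's per-view cache avoids recomputing an expensive page on every request:

from django.http import HttpResponse
from django.views.decorators.cache import cache_page

@cache_page(60)  # cache the rendered response for 60 seconds
def result_list(request):
    # the expensive query/rendering runs at most once per minute per URL
    return HttpResponse('...')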

support boolean/binary categoricals

PRs #88 and #131 add support for categorical variables when the variable can take at least 3 values.

There is no reason the same code cannot work for 2 values, except that dummy encoding a boolean just adds a redundant column.

The hacky way is to just take what's in the above PRs and turn off the check that the variable takes at least 3 values. Unless that's actually fine to do, I plan to re-add support for binary categoricals in a way that is both correct and doesn't require dummy encoding.
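A small sketch of the non-redundant encoding for the two-value case (the helper name is illustrative):

import numpy as np

def encode_binary(values, categories):
    """A binary categorical needs one 0/1 column, not two dummy columns."""
    assert len(categories) == 2
    return np.array([[float(v == categories[1])] for v in values])

# e.g. encode_binary(['off', 'on', 'off'], ['off', 'on']) -> [[0.], [1.], [0.]]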

User-Defined Function - Server side

Add a target_objective.json file which has information about latency and its value. Upload it together with the other four files (knobs.json, metrics_before/after, summary.json) to the tuner. The tuner needs to parse the file and get the value of the target objective.

def handle_result_files(session, files):

You may also want to modify the parser.

def latency_timer(self):

Test your code locally to make sure it can tune latency successfully.
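A hedged sketch of the parsing step on the server; the JSON layout ({"target_objective": ..., "value": ...}) and the helper name are assumptions, not a spec:

import json

def parse_target_objective(files):
    """Pull the target objective name and value out of the uploaded file blob."""
    data = json.loads(files['target_objective'])
    return data['target_objective'], float(data['value'])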

upload_data.py not working

When Bohan and I tried to upload data using upload_data.py, we encountered the following error.

ubuntu@ubuntu-xenial:/ottertune/server/website$ python script/controller_simulator/upload_data.py script/controller_simulator/generated_data 0987654321
Traceback (most recent call last):
File "script/controller_simulator/upload_data.py", line 64, in
main()
File "script/controller_simulator/upload_data.py", line 60, in main
upload(args.datadir, args.upload_code)
File "script/controller_simulator/upload_data.py", line 30, in upload
for wkld_dir in sorted(glob.glob(os.path.join(basedir, '*'))):
File "/usr/lib/python2.7/posixpath.py", line 70, in join
elif path == '' or path.endswith('/'):
AttributeError: 'list' object has no attribute 'endswith'

Change Celery log path

Celery and Django share the same log file, website.log, which may cause permission issues when restarting the Celery worker or Django. For now we can grant permissions for /tmp/log/website.log* on the ottertune machine when encountering such problems. It is not urgent, and I will change the Celery log file to celery.log when I am free.
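One possible fix, for illustration only (the app name "website" and the log directory are assumptions), is to start the worker with its own log file:

celery -A website worker --loglevel=INFO --logfile=/tmp/log/celery.log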

Invalid target objective in tuning session crashes session page

During the creation or editing of a session, if a non-existent target objective is entered, as below:

(screenshot omitted: session form with an invalid target objective, 2018-02-17)

then the session page crashes. Also, if the user doesn't specify a target objective when tuning session is true, the table page gets a bit messed up.

Support tuning without previous data

Right now the automatic tuning component fails if there is no knob/metric data in its repository. To add support for tuning with an empty repository, OtterTune should start collecting some initial data so that it can train its ML models, then proceed as usual after it has collected enough training data.
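One hedged way to bootstrap an empty repository is to generate the first few configurations by sampling each tunable knob uniformly within its allowed range; the field names (name, minval, maxval) mirror the KnobCatalog description elsewhere in these issues but are assumptions here:

import random

def random_initial_configs(knobs, num_configs=10):
    """Sample each numeric knob uniformly within [minval, maxval]."""
    configs = []
    for _ in range(num_configs):
        configs.append({knob.name: random.uniform(float(knob.minval), float(knob.maxval))
                        for knob in knobs})
    return configs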

Ottertune doesn't start due to a missing lib

When trying to run the ottertune server, it blows up with the following message:

  File "/Users/edward/.virtualenvs/ottertune/lib/python2.7/site-packages/django/db/backends/mysql/base.py", line 28, in <module>
    raise ImproperlyConfigured("Error loading MySQLdb module: %s" % e)
django.core.exceptions.ImproperlyConfigured: Error loading MySQLdb module: No module named MySQLdb

Incremental Knobs

We may want to use an incremental approach where OtterTune dynamically increases the number of knobs used in a tuning session over time. Details can be found in our paper.

Settings to replicate paper

Hi all,
I'm trying to use ottertune to replicate the results obtained in "Automatic Database Management System Tuning Through Large-scale Machine Learning".
More specifically, I'm trying to run the TPC-C workload of OLTPBench. I've seen that the number of warehouses (scale) has been set to 200, but I can't find the number of terminals or the weights for the various transaction types. Could you please share the XML configuration file you used?

I'd also like to know how long you let the workload run before evaluating a single trial, as I think the default duration of 10 seconds is too short to obtain stable results. I'm leaning toward longer trials (~10 minutes) and discarding the first few minutes to avoid warm-up effects; what do you think?

Thanks in advance!

Below is a sample configuration provided by OLTPBench:

<?xml version="1.0"?>
<parameters>

        <!-- Connection details -->
        <dbtype>postgres</dbtype>
        <driver>org.postgresql.Driver</driver>
        <DBUrl>jdbc:postgresql://DB_IP:5432/</DBUrl>
        <username>postgres</username>
        <password>password</password>
        <isolation>TRANSACTION_READ_COMMITTED</isolation>

        <!-- Scale factor is the number of warehouses in TPCC -->
        <scalefactor>2</scalefactor> <!-- instead of 2 -->

        <!-- The workload -->
        <terminals>2</terminals> <!--instead of 2 -->
        <works>
                <work>
                        <time>10</time> <!-- instead of 10 -->
                        <rate>10000</rate>
                        <ratelimited bench="tpcc">true</ratelimited>
                        <weights>45,43,4,4,4</weights>
                </work>
        </works>

        <!-- TPCC specific -->  
        <transactiontypes>
                <transactiontype>
                        <name>NewOrder</name>
                </transactiontype>
                <transactiontype>
                        <name>Payment</name>
                </transactiontype>
                <transactiontype>
                        <name>OrderStatus</name>
                </transactiontype>
                <transactiontype>
                        <name>Delivery</name>
                </transactiontype>
                <transactiontype>
                        <name>StockLevel</name>
                </transactiontype>
        </transactiontypes>
</parameters>

Workload Issue

The workload model is kind of messy. (1) We do not distinguish between training-data workloads and tuning-data workloads. (2) OtterTune considers two workloads the same if they have the same name, so if two users both name their workloads 'workload-0', OtterTune will treat them as the same workload. (3) When deleting a session, the corresponding workloads are not deleted.

Controller Signal

When the controller is not given a running time, it will continue to run until it receives a 'stop' signal.

update Vagrant

I have not tested Python3 on Vagrant. Update Vagrant to support it. Remove supervisor in the fab file if necessary.

What's your plan about open source this project

@dvanaken, it is very exciting that you have made this project open source. According to the article https://aws.amazon.com/cn/blogs/ai/tuning-your-dbms-automatically-with-machine-learning/, the OtterTune project consists of several modules like the Tuning Manager, the JDBC Manager, and so on. It seems the current release is incomplete; we can't run a complete scenario with OtterTune and the Website project.
You also mentioned you will offer it as an online tuning service, so I'm wondering what your plan for this project is: will you fully open source it, or just part of it? If you will open source a 'working' release, what is the schedule?

Thanks.

convert_numeric_metrics in BaseParser

Convert_numeric_metrics is a method in the BaseParser class that iterates over the numeric_metric_catalog_ field.

  1. There's a redundant conditional within the for loop that checks if the metrictype of a specified metric is MetricType.COUNTER. By definition, the numeric_metric_catalog_ contains only COUNTER metrics.

  2. The method assumes that every numeric metric is found within the input metrics argument. This would raise a KeyError if not true. A fix would be to replace it with the 'get' method and check whether the value of the metric is None.

  3. Convert_numeric_metrics uses convert_integer rather than convert_real, then uses "float(converted) / observation_time". There is a possible loss of precision.
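A hedged sketch of the three fixes together; the method and helper signatures follow the description above, not the exact BaseParser source:

def convert_numeric_metrics(self, metrics, observation_time):
    converted = {}
    for name, metadata in self.numeric_metric_catalog_.items():
        value = metrics.get(name)  # fix 2: tolerate metrics missing from the input
        if value is None:
            continue
        # fix 3: convert_real avoids the integer truncation / precision loss
        real_value = self.convert_real(value, metadata)
        # fix 1: no COUNTER check needed; numeric_metric_catalog_ holds only COUNTERs
        converted[name] = real_value / observation_time
    return converted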

Collector doesn't support Postgres9.3

The pg_stat_archiver statistics view didn't come to PostgreSQL until 9.4. As a result, collecting data fails on 9.3.

We should add some way to check versions of the DBMS in the collector before looking for stat views that don't exist yet.

BaseParser parse_dbms_metrics supports global and local scopes only

Postgres96Parser metrics have different scopes - database, table, index, global. Currently, the BaseParser machinery supports global and local scopes only, and raises Exceptions otherwise.

Postgres96Parser needs an implementation of parse_dbms_metrics that doesn't rely on the BaseParser's parse_dbms_variables method, or the BaseParser's parse_dbms_variables method needs to be able to deal with various common scopes, or condense them into one scope.

Vagrant File Synced Folder

Vagrant has the capability to sync folders between the host machine and the VM.

This is potentially more secure than symlinks and portable to other hypervisors since Vagrant handles the configurations.

It would probably also help with the long-filename issues and the development process.

User-Defined Function - Client side

When the driver stops OLTPBench and the controller, OLTPBench will output some statistics like latency, and the controller will output the collected data (knobs.json, metrics_before/after.json, summary.json). You may want to parse the OLTPBench output file and write it into a new file, target_objective.json, which holds the target objectives and their values (e.g., latency).

Then upload target_objective.json together with the four other JSON files collected from the controller to the OtterTune server.
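An illustrative sketch of the client-side step; the JSON layout and the way the latency value is obtained from the OLTPBench output are assumptions, not a spec:

import json
import os

def write_target_objective(avg_latency_ms, out_dir):
    """Write target_objective.json next to the other controller output files."""
    payload = {'target_objective': 'latency', 'value': avg_latency_ms}
    with open(os.path.join(out_dir, 'target_objective.json'), 'w') as f:
        json.dump(payload, f)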

upload_result()

Fix the PR #147 first.

Deployment Instructions?

I'm interested in trying out OtterTune, and I'd love more detailed information about where to start! I can see that most questions about deployment were closed last November with notes that more detailed instructions would be coming soon. Is there an estimate of when more detailed information might be available, or any pointers on where to begin in the meantime?

How to install ottertune?

Hi,
Is there any document for the full installation? Is the website already included in ottertune?
Thanks

move parameters to the constant file

Move the parameters in the analysis part to the constants file: ottertune/server/website/website/settings/constants.py

Tune these parameters if you have free time.

Change Password

It seems that we do not have a page view for users to change their password. Can anyone help with it?

Port Ottertune to Python3

As seen here: https://pythonclock.org/, Python 2 is being retired in the next two years. It would be very handy to port the project to Python 3 as soon as possible. One idea is to use the 2to3 tool, which automatically rewrites the project's source files. It can be executed as a dry run so that you can see the impact of the changes. The main impediment is that some libs like poster are not ported to Python 3 (one of the motivations for my PR #49, besides requests being awesome), but that one in particular looks like abandonware, imho.

Wdyt, @dvanaken ?

Starting Points

For now we only use random samples as starting points in GPR.

X_samples = np.empty((num_samples, X_scaled.shape[1]))

We may want to add the previous top-K configs with the best performance as starting points. K is a hyperparameter included in the constants file ottertune/server/website/website/settings/constants.py; the default K value is 20.
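A hedged sketch of how the top-K configs could be stacked onto the random samples; the constant name TOP_NUM_CONFIG and the smaller-is-better assumption on y_scaled are illustrative:

import numpy as np

TOP_NUM_CONFIG = 20  # default K, to live in settings/constants.py

def add_top_configs(X_samples, X_scaled, y_scaled):
    """Append the K best previously-seen configs to the random starting points."""
    # assumes y_scaled is already "smaller is better" after any sign flip
    top_idx = np.argsort(y_scaled.ravel())[:TOP_NUM_CONFIG]
    return np.vstack([X_samples, X_scaled[top_idx]])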

Validate knob inputs and outputs

OtterTune currently does not validate the values of the knobs when a user uploads new data to the website. It also does not validate them before creating the next configuration for the user to run.

We should perform the following checks for each knob using the information in the KnobCatalog:

  • If it's an integer or real, check that the value is between minval and maxval
  • If it's an enum, check that the value is in the enumvals list
  • If it's a bool, check that the value is one of the formats supported by the DBMS (e.g., true, false, 0, 1)
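A minimal sketch of these checks; the KnobCatalog field names (vartype, minval, maxval, enumvals) are assumptions about the schema:

def validate_knob_value(knob, value):
    """Return True if the value is legal according to the KnobCatalog entry."""
    if knob.vartype in ('integer', 'real'):
        return float(knob.minval) <= float(value) <= float(knob.maxval)
    if knob.vartype == 'enum':
        return str(value) in knob.enumvals.split(',')
    if knob.vartype == 'bool':
        return str(value).lower() in ('true', 'false', 'on', 'off', '0', '1')
    return True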

Fixtures for different DBMS versions

This is not urgent, but I think we can store the fixtures for different database versions in a better way. For example, most of the knobs and metrics in Postgres 9.2, 9.3, 9.4, and 9.6 are the same. We could store these common knobs/metrics in a single Postgres fixture and store the version-specific knobs/metrics separately. For now we have duplicated knobs/metrics across all Postgres versions.

ottertune/server/website/website/fixtures/

OLTPBench Log

Run OLTPBench on ottertune-dev and write a Python script that checks its log periodically (e.g., every second). OLTPBench will create some terminals, start the experiment, finish the experiment, and kill the processes. Your Python script should detect when OLTPBench starts and finishes the experiment by checking its log, then send a signal to the controller to start and terminate. You may want to add this glue code to the driver code.

# run oltpbench as a background job
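An illustrative sketch of the log-polling glue; the log path, marker strings, and how the controller is signaled are all assumptions:

import time

def wait_for_marker(log_path, marker, poll_interval=1.0):
    """Block until `marker` appears in the OLTPBench log."""
    position = 0
    while True:
        try:
            with open(log_path) as f:
                f.seek(position)
                chunk = f.read()
                position = f.tell()
        except IOError:
            chunk = ''  # the log file may not exist yet
        if marker in chunk:
            return
        time.sleep(poll_interval)

# e.g. wait for a start-of-benchmark marker, signal the controller to start,
# then wait for a completion marker and signal it to stop.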

Negative Value for cache size?

The loop script is almost done. However, OtterTune sometimes recommends knob values outside the reasonable range. I am still looking into it and will fix it soon.

analysis/workload_mapping.py -- is this still relevant?

It doesn't look like anything else in the project depends on either the WorkloadState or WorkloadMapper classes defined here, and there's an independent map_workload task that really does the work right now. Or am I missing something?

How is ottertune deployed?

I am very interested in ottertune and have viewed the relevant documents, but I do not know how to use it. Could you provide detailed installation and usage documentation? Thank you.
