fenilgmehta / jaltantra-code-and-scripts

Code and Scripts written for JalTantra Project

License: GNU General Public License v3.0

Languages: Shell 4.09%, Python 86.68%, R 9.23%

jaltantra-code-and-scripts's Introduction

Jaltantra-Code-and-Scripts

NOTE: This has been moved to https://github.com/jaltantra/Jaltantra_loop

Code and Scripts written for Jaltantra Project

Shared Drive Folder
- Time vs Objective Function Value
Programs/Scripts
- auto_run_model_and_data_in_parallel.py - Automatically execute solver for multiple model and data files while monitoring the system resources
- htop_monitor_writter.sh - Monitor system resource usage using htop and save it to a file at regular intervals
  - htop_monitor_reader.sh - Read the output generated by the above monitoring program
- output_table_extractor_baron.sh - Extract important values from the output generated by Baron solver
- output_table_extractor_octeract.sh - Extract important values from the output generated by Octeract solver
- CalculateNetworkCost.py - Automatically execute multiple Solvers and multiple Models on input graph/network (i.e. data/testcase file) and return the best solution
  - python3 CalculateNetworkCost.py -p Files/Data/m1_m2/d9_HG_SP_4_2.dat --solver-models 'baron 1 2' 'octeract 1 2' --time '0:5:0' --debug
  - To clean up the temporary files created by this program in the /tmp directory, execute the commands below
```
# Make sure that the below GLOB does not match any other important file or
# directory created by someone other than CalculateNetworkCost.py
# And, do not perform this clean up when any instance of CalculateNetworkCost.py is running
rm -r /tmp/pid_* /tmp/at*octsol /tmp/baron_tmp*
```
- CalculateNetworkCost_ExtractResultFromAmplOutput.py - Extract the following values from stdout/stderr logs of AMPL + Solver
  1. head for each node
  2. flow for each arc/edge
  3. pipe ID and pipe length for each arc/edge

Overview

| Original thing | Similar to |
| --- | --- |
| AMPL | Shell |
| Solver | Java bytecode interpreter |
| Model file | Algorithm / Java function |
| Network/Data file | Testcases / Input / Function parameters |

AMPL is like a shell because we use it to define variables that point to the data file and the model file, and then execute the solver on them. This is similar to:
```
solver --data-file DATA_FILE_PATH --model-file MODEL_FILE_PATH --solver-parameter1 value1 --solver-parameter2 value2 ...
```
The solver is like a bytecode interpreter because multiple bytecode interpreters exist and each works in its own way (Oracle interpreter, OpenJDK interpreter, Kotlin interpreter).
The model file is like an algorithm.
The network/data file holds the graph data.

AMPL Commands

AMPL execution commands are present in Files/main.run

Model Files

Present at Files/Models
These are AMPL model files saved with the .R file extension (they are not R programs).
According to the AMPL Dev User Guide, these files should have the .mod file extension. However, we have used .R so that we get syntax highlighting in text editors like VSCode and SublimeText.

| Unique Name | Original Name |
| --- | --- |
| m1 | Basic |
| m2 | Basic2 Basic2_v2 |
| m3 | Descrete Segment |
| m4 | Parallel Links |

Data Files

For model m1 and m2, data files are in Files/Data/m1_m2
For model m3 and m4, data files are in Files/Data/m3_m4
According to the AMPL Dev User Guide, these files should have .dat file extension. However, we have used .R so that we get syntax highlighting in text editors like VSCode and SublimeText

| Unique Name | Original Name |
| --- | --- |
| d1 | Two Loop |
| d2 | Cycle Hanoi |
| d3 | Double Hanoi |
| d4 | Triple Hanoi |
| d5 | Taichung |
| d6 | HG_SP_1_4 |
| d7 | HG_SP_2_3 |
| d8 | HG_SP_3_4 |
| d9 | HG_SP_4_2 |
| d10 | HG_SP_5_5 |
| d11 | HG_SP_6_3 |

jaltantra-code-and-scripts's Issues

Use SHA-256 hash instead of MD5

Why move away from the MD5 hash algorithm?

MD5 is not collision resistant (source of this information). So, two different network files can have the same MD5 hash, which would lead to unexpected behaviour if two or more such files are submitted for processing to CalculateNetworkCost.py#L160.

Why the SHA-256 hash algorithm?

SHA-256 has no known practical collision attacks and is widely trusted; even Bitcoin uses it (source of this information).
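As a minimal sketch of the proposed switch, the snippet below hashes a network file with SHA-256 using Python's standard `hashlib`. The function name `file_sha256` and the chunk size are assumptions made for illustration; they are not taken from CalculateNetworkCost.py.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so that large network files do not have to
    fit in memory. `file_sha256` is a hypothetical helper name.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter(callable, sentinel) keeps reading until f.read() returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The digest is a 64-character hexadecimal string, so any directory or file names derived from the hash become longer than with MD5 (32 characters).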

`0_status` file content and its usefulness

Existing System

Current format of 0_status file

Line 1: Status (success => True, failure => False)
Line 2: Solver Name
Line 3: Model Name (unique short form)
Line 4: std_out_err file path of Solver-Model combination
Line 5: Best value of the objective function that was found by the Solver
Line 6: Solver specific result file to parse (like .../NetworkResults/SolutionData/at64992.octsol or .../NetworkResults/SolutionData/baron_tmp284198/res.lst)
Line 7: Objective value extracted from "solver specific result file"
Line 8+: Solution vector (i.e. variables with names x1, x2, x3, ... and their values) extracted from "solver specific result file"
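The line layout above can be read with a few lines of Python. This is only a sketch under stated assumptions: the `StatusFile` class, its field names, and `parse_status` are hypothetical, and every value other than the status flag is kept as a string because the exact number formatting is not specified here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StatusFile:
    """Hypothetical in-memory view of a 0_status file (field names are assumptions)."""
    success: bool
    solver_name: str
    model_name: str
    std_out_err_path: str
    best_objective: str
    result_file_path: str
    extracted_objective: str
    solution_vector: List[str] = field(default_factory=list)


def parse_status(text: str) -> StatusFile:
    """Map lines 1 to 7 to named fields and keep lines 8+ as the solution vector."""
    lines = text.splitlines()
    # Pad so that a short file (e.g. after a failure) still parses.
    head = lines[:7] + [""] * (7 - len(lines[:7]))
    return StatusFile(
        success=(head[0].strip() == "True"),
        solver_name=head[1].strip(),
        model_name=head[2].strip(),
        std_out_err_path=head[3].strip(),
        best_objective=head[4].strip(),
        result_file_path=head[5].strip(),
        extracted_objective=head[6].strip(),
        solution_vector=lines[7:],
    )
```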

Proposed Change

In this old format, the content from line 6 onwards is of no use. So, it would be better if we could have the output of CalculateNetworkCost_ExtractResultFromAmplOutput.py (program link) from line 6 onwards.

This would have three advantages:

- CPU usage optimization: We will not have to call CalculateNetworkCost_ExtractResultFromAmplOutput.py every time a user requests the solution of the same network. The processing and parsing of the std_out_err.txt file done by CalculateNetworkCost_ExtractResultFromAmplOutput.py (using awk and python) would be done only once (from CalculateNetworkCost.py). And, in the JalTantra backend, we will just have to read the 0_result.txt file to get the values which are to be presented to the user.
- Performance optimization: Since the processing and parsing of the std_out_err.txt file would be done by CalculateNetworkCost.py, the Java/JSP program will not have to spend time waiting for CalculateNetworkCost_ExtractResultFromAmplOutput.py. It will just have to read 0_result.txt to get the solution data.
- Storage savings: In case of a storage shortage, we can delete everything except the 0_status and 0_result.txt files for each network, and our server will continue to work as before (i.e. it will continue to respond to already submitted requests from the information saved in 0_status and 0_result.txt).

Solvers do not stop immediately after receiving signal from `CalculateNetworkCost.py` using `kill -s SIGINT <PID>`

Bug Description

Solvers do not stop immediately after receiving signal from CalculateNetworkCost.py using kill -s SIGINT <PID>
This bug causes the SolverOutputAnalyzer.<SolverName>_extract_best_solution(...) functions to conclude that the Solver-Model combination failed for some reason. In reality, the solver is waiting for something to happen (e.g. establishing a connection with its server), and once that event occurs, the solver returns control to AMPL.

Expected Behaviour

CalculateNetworkCost.py program must wait for the tmux sessions to end before proceeding to extract_best_solution(...)
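One way to express the expected behaviour is a polling loop that returns only when no matching tmux session remains. The sketch below is not the project's code: `list_tmux_sessions` and `wait_for_sessions_to_end` are hypothetical names, sessions are matched by name prefix (an assumption), and a timeout is added so the caller cannot block forever.

```python
import subprocess
import time


def list_tmux_sessions(prefix):
    """Return names of running tmux sessions whose name starts with `prefix`."""
    try:
        proc = subprocess.run(
            ["tmux", "list-sessions", "-F", "#{session_name}"],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        return []  # tmux is not installed
    if proc.returncode != 0:
        return []  # e.g. "no server running": there are no sessions
    return [name for name in proc.stdout.splitlines() if name.startswith(prefix)]


def wait_for_sessions_to_end(prefix, poll_seconds=1.0, timeout_seconds=60.0,
                             list_sessions=list_tmux_sessions, sleep=time.sleep):
    """Poll until no session with `prefix` is left.

    Returns True if all sessions ended, False if the timeout was reached first.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if not list_sessions(prefix):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_seconds)
```

Calling something like `wait_for_sessions_to_end(...)` before `extract_best_solution(...)` would let a solver that reacts slowly to SIGINT finish writing its output first. A `False` return could then be treated as a genuine failure.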

File(s) that need to be updated

CalculateNetworkCost.py

Increase logging to get latest status of the progress made by `CalculateNetworkCost.py` from output/logs

Description

Currently, if CalculateNetworkCost.py terminates due to some unhandled cause, then it is not possible to know when it occurred. However, if we print log messages before and after every major task, then it would be easier to:

- pinpoint unexpected bugs and program terminations
- know how much the program has progressed in its execution based on its output/logs
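A small context manager is one possible way to get a log line before and after every major task without repeating the calls by hand. This is a sketch: the logger name and the `logged_task` helper are assumptions, not existing code in CalculateNetworkCost.py.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; the real program may configure logging differently.
logger = logging.getLogger("CalculateNetworkCost")


@contextmanager
def logged_task(name):
    """Log when a task starts, and when it finishes or fails, with its duration."""
    logger.info("START %s", name)
    started = time.monotonic()
    try:
        yield
    except Exception:
        # logger.exception also records the traceback of the active exception
        logger.exception("FAILED %s after %.2fs", name, time.monotonic() - started)
        raise
    else:
        logger.info("DONE %s in %.2fs", name, time.monotonic() - started)
```

A task would then be wrapped as `with logged_task("launch solvers"): ...`. If the program dies in the middle, the last `START` line without a matching `DONE` or `FAILED` line shows where it stopped.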

File(s) that need to be updated

CalculateNetworkCost.py

`CalculateNetworkCost.py` does not catch AMPL error

What is the bug?

The following error is not caught by CalculateNetworkCost.py

Sorry, a demo license for AMPL is limited to 500 variables
and 500 constraints and objectives (after presolve) for linear
problems.  You have 756 variables, 72 constraints, and 1 objective.

However, the below error does get caught

Sorry, a demo license for AMPL is limited to 300 variables
and 300 constraints and objectives (after presolve) for nonlinear
problems.  You have 826 variables, 143 constraints, and 1 objective.
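The two messages differ only in the limits and in the word "linear" versus "nonlinear", so a single pattern can cover both. The following is a sketch of such a check; the name `find_demo_license_error` and the returned dictionary are assumptions, and the pattern is built only from the two messages quoted above.

```python
import re

# \s+ is used where the message wraps, so the pattern survives line breaks.
DEMO_LICENSE_RE = re.compile(
    r"Sorry, a demo license for AMPL is limited to (\d+) variables\s+"
    r"and (\d+) constraints and objectives \(after presolve\) for "
    r"(linear|nonlinear)\s+problems\."
)


def find_demo_license_error(output):
    """Return details of an AMPL demo-license error found in `output`, else None."""
    match = DEMO_LICENSE_RE.search(output)
    if match is None:
        return None
    return {
        "variable_limit": int(match.group(1)),
        "constraint_limit": int(match.group(2)),
        "problem_kind": match.group(3),
    }
```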

Input file

Input file to reproduce this error with demo license of AMPL is here
Its SHA256 hash is 0a2f2a9181691e7fe8a3b8a21cbcc42ac78958f22a1cdddaf3fd2561bbf49fc2

Exit code of `tmux ls | ...` varies based on the command to which the output is piped

Description

There are two cases:

1. Command = tmux ls | grep "{g_settings.TMUX_UNIQUE_PREFIX}"
   - Exit code is 1
   - Result returned by run_command(...) and run_command_get_output(...) is the parameter default_result
2. Command = tmux ls | grep "{g_settings.TMUX_UNIQUE_PREFIX}" | wc -l
   - Exit code is 0
   - Result returned by run_command(...) and run_command_get_output(...) is
```
no server running on /tmp/tmux-1000/default
0
```
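The difference comes from how a shell reports the status of a pipeline: without `pipefail`, the exit code is that of the last command. `grep` exits with 1 when nothing matches, whereas `wc -l` exits with 0 even on empty input. The sketch below reproduces both cases without needing tmux. The `NO_SERVER` stand-in and the `run_pipeline` helper are assumptions; stderr is merged into stdout because the output quoted above suggests the real helpers do the same.

```python
import subprocess


def run_pipeline(command):
    """Run `command` through the shell and return (exit code, combined output)."""
    proc = subprocess.run(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured output
        text=True,
    )
    return proc.returncode, proc.stdout


# Stand-in for `tmux ls` when no server is running:
# a message on stderr, nothing on stdout, exit code 1.
NO_SERVER = "(echo 'no server running' >&2; exit 1)"

# Case 1: the pipeline ends in grep, which finds no match and exits with 1.
code_grep, out_grep = run_pipeline(f"{NO_SERVER} | grep PREFIX")

# Case 2: the pipeline ends in wc, which exits with 0 and prints a count of 0.
code_wc, out_wc = run_pipeline(f"{NO_SERVER} | grep PREFIX | wc -l")
```

One possible way to avoid the ambiguity is to run `tmux ls` on its own, check its exit code in Python, and filter the session names in Python instead of piping through `grep` and `wc`.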

File(s) that need to be updated

CalculateNetworkCost.py

True best solution is not always chosen due to rounding

What is the bug?

For an input file on which model m1 finds only a local optimum but model m2 finds the global optimum, the extract_best_solution(...) function may report that model m1 is better because of a rounding error (described below).

Details

Update the function below (in file CalculateNetworkCost.py#L881) to select the best solution based on the true objective function value and not its approximate value.

def extract_best_solution(tmux_monitor_list: List[NetworkExecutionInformation])

The current code works for almost all cases. But there are some corner cases where a slightly suboptimal solution may get selected as the best solution. This happens because solvers like Baron and Octeract round the objective value (by printing it in scientific notation) in the table that represents the state of the execution.

Example

420128.37368597434 becomes 4.201e+05 when the solvers print the table representing the state of the execution, and when this value is read as string and converted to float by Python, it becomes 420100.0
- Input file = modified Sample_input_cycle_twoloop.xls
- Link = Jaltantra-Website/Downloads/Sample_input_cycle_twoloop.xls
  - Deleted the rows 40 (link 7 --> 5) and 37 (link 3 --> 5)
- Modified input file SHA-256 hash = 495b0d36232876eb5f70b8b8f9fc0397a8f2f3a65355059e0ceb35940ec6c9a7
- Network file SHA-256 hash = 8af7e8c1cda61aecd09933f6e9a467a54f74f29f4e0efd2cbe40ebcc1239f4b0
- Model = m2
- Solver = Octeract
- Solver specific solution file = at80535.octsol

Possible Solution

Print the variable total_cost using the display command of AMPL and use it to select the best solution, instead of using the bash scripts (matching the glob output_table_extractor_*)

option display_precision 0;
display total_cost;
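Once AMPL prints the value, the Python side only needs to pick it out of the captured output. The sketch below assumes that `display total_cost;` produces a line of the form `total_cost = <number>`; that format and the helper name `extract_total_cost` are assumptions to verify against a real std_out_err.txt file.

```python
import re

# Assumed output format: "total_cost = 420128.37368597434" on its own line.
TOTAL_COST_RE = re.compile(
    r"^\s*total_cost\s*=\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)\s*$",
    re.MULTILINE,
)


def extract_total_cost(ampl_output):
    """Return the last total_cost value printed in `ampl_output`, or None."""
    matches = TOTAL_COST_RE.findall(ampl_output)
    if not matches:
        return None
    return float(matches[-1])
```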

Extra Info

For checking whether the solver-model combination (where solver is Octeract) solved the input to global optimum or not, the following condition seems enough:

# REFER: https://docs.octeract.com/sf1002-exit_flags
octsolJson['statistics']['dgo_exit_status'] == 'Solved_To_Global_Optimality'

Store information regarding execution time limit for all requests

Summary

Store info regarding how much execution time limit was given for a "network file request" to CalculateNetworkCost.py

Example

In future, as the scope of this project increases, there is a possibility that a user sends a request for a network with the time limit of 2 minutes. And, the next day another user sends the same network with the time limit of 120 minutes.

OBSERVED RESULT: With the current code, the result of 2 minute execution will be returned.
EXPECTED RESULT: Expected outcome is that the network is solved for 120 minutes and the new results are returned, because the old results are of 2 minute execution only.
POSSIBLE SOLUTION: Store the network hash and the execution time limit in the database (or in some other file). If the execution time limit in the database is greater than or equal to the new request's execution time limit, return the already solved result; otherwise launch "CalculateNetworkCost_JaltantraLauncher.sh" for the new request.
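The decision rule in the possible solution can be sketched as follows. A plain dictionary stands in for the database, and the names `parse_time_limit` and `should_reuse_result` are hypothetical. The `H:M:S` format is assumed from the `--time '0:5:0'` argument shown in the README.

```python
from datetime import timedelta
from typing import Dict


def parse_time_limit(text: str) -> timedelta:
    """Parse an 'H:M:S' string such as '0:5:0' into a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def should_reuse_result(solved_limits: Dict[str, timedelta],
                        network_hash: str,
                        requested: timedelta) -> bool:
    """Reuse a stored result only if it was computed with at least the requested time limit."""
    previous = solved_limits.get(network_hash)
    return previous is not None and previous >= requested
```

With a stored limit of 2 minutes, a new request for 120 minutes on the same network is not served from the stored result, so the launcher would be started again and the stored limit updated afterwards.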

Refactor code of AMPL error checking

Suggestion

Move the code of AMPL error checking under the functions SolverOutputAnalyzer.<SolverName>_check_errors(...) (in CalculateNetworkCost.py) to a common function

Advantages

- This will help avoid multiple copies of the same code under the functions SolverOutputAnalyzer.<SolverName>_check_errors(...)
- This will ensure that AMPL errors are caught for all solvers with just one common function call
- This will save us from the situation below:
  - Code to catch an AMPL error is written for Solver S1 function SolverOutputAnalyzer.s1_check_errors(...) but not copied to SolverOutputAnalyzer.s2_check_errors(...)
  - Such mistakes will lead to inconsistency in AMPL error catching for different solvers that are integrated with CalculateNetworkCost.py
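The suggested layout could look like the sketch below. Everything in it is illustrative: the return type, the marker strings for solver-specific errors, and the body of `check_ampl_errors` are assumptions, and the real functions in CalculateNetworkCost.py may have different signatures.

```python
from typing import Optional, Tuple

# AMPL-level error markers live in one place, shared by every solver.
AMPL_ERROR_MARKERS = [
    "Sorry, a demo license for AMPL is limited to",
]


def check_ampl_errors(std_out_err: str) -> Optional[str]:
    """Return the first AMPL error marker found in the output, or None."""
    for marker in AMPL_ERROR_MARKERS:
        if marker in std_out_err:
            return marker
    return None


class SolverOutputAnalyzer:
    @staticmethod
    def s1_check_errors(std_out_err: str) -> Tuple[bool, str]:
        """Return (ok, message); the AMPL checks run first via the common function."""
        ampl_error = check_ampl_errors(std_out_err)
        if ampl_error is not None:
            return False, ampl_error
        if "HYPOTHETICAL_S1_ERROR" in std_out_err:  # placeholder solver-specific check
            return False, "HYPOTHETICAL_S1_ERROR"
        return True, ""

    @staticmethod
    def s2_check_errors(std_out_err: str) -> Tuple[bool, str]:
        """Same structure as s1; only the solver-specific part differs."""
        ampl_error = check_ampl_errors(std_out_err)
        if ampl_error is not None:
            return False, ampl_error
        if "HYPOTHETICAL_S2_ERROR" in std_out_err:  # placeholder solver-specific check
            return False, "HYPOTHETICAL_S2_ERROR"
        return True, ""
```

With this layout, adding a new AMPL error means appending one entry to `AMPL_ERROR_MARKERS`, and every solver picks it up.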

File(s) that need to be updated

CalculateNetworkCost.py
