rjsears / chia_plot_manager

Python scripts to manage Chia plots and drive space, providing full reports. Also monitors the number of chia coins you have. Auto Drive helps to automate the addition of new hard drives to your system and to the chia config.

License: MIT License

Python 89.13% HTML 1.84% Shell 9.03%

chia-plots chia-blockchain plot-manager chia-plot plot-drives drive-manager plotting-server

chia_plot_manager's Issues

install.sh fails using zsh on 0.9.4

Using zsh as my default shell, on 0.9.4 the install.sh fails with:

-e -n Should we CONTINUE?
install.sh: 44: read: Illegal option -n
install.sh: 45: install.sh: [[: not found
install.sh: 247: install.sh: Bad substitution
root@harv1:~/plot_manager# bash install.sh nas

Simply re-running it using bash install.sh nas fixes it.

Add code for moving plots created locally to final dest drive

When running NAS as a plotter as well, need to have code to move plots off -d drive onto final plotting drive.

Is receive_plot.sh missing in this repo?

Questions about drive skel and mapping

Hi!

Fantastic script. Just getting it set up, but I'm a little confused on auto_drive.py.
I can see skel files in ~/plot_manager/extras/drive_structures, are these actively sourced by auto_drive.py or does it rely on a file elsewhere to build out the mountpoints?

I've got 4 enclosures of 24 drives each, all front so I'm going to need to edit this to reflect my setup.
Do I just copy one of the skel files and it picks it up accordingly, or is there something I'm missing?

Lastly, after running auto_drive.py without any changes (and unmounting and wiping 2 drives) it only appears to pick up 1. Is this intentional so that one can re-run auto_drive.py as each physical drive comes online, or?

Thanks again for putting this together and releasing it! It'll be a huge help.

Import Error with psutil._common

First of all, thank you for putting so much time and effort for making this. I was almost going to attempt doing this (probably a much simpler version since I'm pretty newbie with coding).

I believe I've done all the required setting up, I've changed the directories to suit mine etc.

When I was trying to run drive_manager.py, below error came up:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/txny/chia_plot_manager/chianas/drive_manager.py", line 68, in <module>
    from psutil._common import bytes2human
ImportError: cannot import name 'bytes2human' from 'psutil._common' (/usr/lib/python3/dist-packages/psutil/_common.py)

Any ideas? I do have psutil installed.

Question about speed that doesn't make sense

This is more than likely a research assistance request. When I have run out of space on any plotting directory (not temp), or when a plot is transferring from temp to plotting, my ncat speed drops significantly, from 110M/S to sometimes < 10M/S. This seems like an issue that shouldn't exist given the hardware layout. Here is an example of my NAS and Plotter:

NAS:
24 CPU
100 GB Ram
LSI 9205-8e connected to 3x NetApp DS4243
72 Drives

Plotter
24 CPU
100GB Ram
Plot Dir (/mnt/hdd/hdd0) 2 TB WD Blue SSD
Temp Dir (/mnt/nvme_drive0) Raid 0 2x 1TB Samsung 970

I am using plotman with MadMax, any time a plot is in 4:1 (transferring from temp to plot dir) I see the drop in speed. This is even worse if the temp dir fills up and the 1 or 2 plots sit in 4:1 for an extended period of time.

I am trying to identify why this is happening. This didn't seem like this was an issue before I moved from .92 to .95 but I can't be sure.

I am looking for some help trying to figure this out, I have a plotter with a smallish plot dir and if it happens to fill up then transfers off the plotter start taking 10x as long and thus the entire group of plotters end up waiting in line and getting backed up.

Create way to offline drive

Need to create a way to offline a drive for replacement where you can mv or copy off the plots to a new drive and not have drive_manager.py keep putting plots on it.
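
One way this could look, as a minimal sketch: keep a list of offlined drives and filter it out before a destination drive is chosen. The function name and both parameters are hypothetical; how drive_manager.py actually stores its drive list is not shown here.

```python
def drives_accepting_plots(mounted_drives, offlined_drives):
    """Return mounted drives that have not been marked offline, preserving order."""
    offline = set(offlined_drives)
    return [drive for drive in mounted_drives if drive not in offline]
```

The offlined list would need to be persisted somewhere (for example in the config file) so that it survives between cron runs.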

per_coin = True in coin_monitor_config

Hello, I am not a software developer, I was just trying to use your script -> coin_monitor.py

I could not run your script without changing per_coin = True to per_coin_email = True in coin_monitor_config

After changing that it is working.

maybe this is something to fix. i just wanted to let you know. thanks.

Dual 10Gbe link code

Update code to add the ability to do multiple plot copies via individual 10Gbe links.

Send multiple plots at once

When we detect that there is more than one plot to be moved, open up multiple connections (maybe 2 x on 10Gbe) to the NAS to receive the plots.

Check all drive entries in config file - Do they actually exist?

With the new config file setup we want to check that any drive that is included as part of the config actually exists to prevent an error down the road...
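
A minimal sketch of such a check, assuming the configured drives are available as a list of mount point paths. The function name and the is_present parameter are hypothetical; os.path.ismount is used as the default test because a missing mount would otherwise look like an ordinary empty directory.

```python
import os


def missing_mountpoints(configured_paths, is_present=os.path.ismount):
    """Return the configured paths for which is_present() is false."""
    return [path for path in configured_paths if not is_present(path)]
```

Running this once at startup and refusing to continue (or sending a notification) when the result is non-empty would surface a bad config entry early.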

Can we make it work on windows too?

I started all my work on Windows. I cannot move to Linux now; I have too many plots and many more coming. It would be great to have a Windows version too. Thank you :)

Move from argparse to click???

Trying to decide if I should move from argparse to click for this project.....

Multiple plotter support

Does this support a single-nas, many (10+) plotter node configuration?

Automated new drive formatting, mounting and updating chia/plot_manager configs

Code that will identify when a new drive is added to the system by detecting that it is not mounted and has no partitions on it. It will then partition it, format it as XFS, mount it, add the required entries to /etc/fstab, and update plot_manager and chia with the new mount point information.

Add a lot more error checking

Need to research the best way to do more error checking and cleanup so errors don't stop system from working.

mad max support

Does this manager support mad max plotter? If not, please, add it

Missing keys in default config file

To reproduce:

follow the instructions and install with install.sh
try python3 drive_manager.py

We get this:

Traceback (most recent call last):
  File "drive_manager.py", line 115, in <module>
    chianas = DriveManager.read_configs()
  File "/root/plot_manager/drivemanager_classes.py", line 115, in read_configs
    total_plot_highwater_warning=server['harvester']['total_plot_highwater_warning'],

and if we rectify the problem by adding the missing key, this follows:

Traceback (most recent call last):
  File "drive_manager.py", line 115, in <module>
    chianas = DriveManager.read_configs()
  File "/root/plot_manager/drivemanager_classes.py", line 116, in read_configs
    total_plots_alert_sent=server['harvester']['total_plots_alert_sent'],
KeyError: 'total_plots_alert_sent'

This is because total_plot_highwater_warning and total_plots_alert_sent are directly accessed in drivemanager_classes.py but not present in the default config file.
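
One possible way to make the lookup tolerant of older config files, sketched under the assumption that server is the dict loaded from the config. The helper name and the default values are assumptions for illustration, not values taken from the project.

```python
# Assumed defaults for illustration only; the project may want different values.
HARVESTER_DEFAULTS = {
    'total_plot_highwater_warning': 0,
    'total_plots_alert_sent': False,
}


def harvester_setting(server, key):
    """Look up server['harvester'][key], falling back to a default if the key is absent."""
    harvester = server.get('harvester', {})
    if key in harvester:
        return harvester[key]
    if key in HARVESTER_DEFAULTS:
        return HARVESTER_DEFAULTS[key]
    raise KeyError(f"'{key}' is missing from the config and has no default")
```

Adding the two keys to the default config file shipped by install.sh would fix the problem at the source; a fallback like this only protects existing installs.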

Fix deps in install.sh

There are some more python and (in my case) apt dependencies that aren't included in install.sh currently.
For example, glances-api (does it really depend on Glances, or can this be omitted?), nc,

Probably needs an installation on a fresh install to get the exact list. Offhand I believe I added ~5-6 python packages and ~3 packages via apt.

Script acknowledge

Would you please share your chia command script, or a single command line, given your asynchronous plotter-harvester structure? And a plotman script based on one local plotter? Thank you~!

help needed

Hey there, you did a really good job with the Chia plot manager. I'm looking for some advice to set up my Chia plotting and farming, same kind of system that you have. would you give me a little hand to pick the devices? also I was wondering how many plots you were able to make a day with your system?
thanks in advance!
best,
olivier

Fix coin_monitor.py for new log file format

1.1.1 changed the way the logfiles are formatted. Need to update coin_monitor.py for this new logfile.

Add plot check and verification

Match plot count against plot verification from chia logs.

Consolidated plot reporting for multiple plotters

Need to write code to provide consolidated plot reporting across multiple harvesters.

First plot of the job end after 20 minutes

Hi!
When I put 3 plots in parallel in the same job, the first ends after 20 minutes, which generates one new plot (only 3 in phase 1 with my 1 TB NVMe), and everything continues fine.
Only the first plot has this issue; the other plots run to the end. Thx!

Multiple Plotters

Hey! Love the script, thanks for all the work you've put into it. I have 3 separate dedicated plotting machines, and the plot_manager.py and kill_nc.py scripts (as well as the send/rec plot scripts) are designed for a single plotter. Do you have any suggestions or thoughts on removing this limitation?

Initially I decided to just append the hostname to the remote_checkfile variable so that the plotters would not kill each other's netcats, but multiple netcats over a gigabit network slow the transfer from approx 110MB/s on a single file to a total of 60-70MB/s across all files. I'd love to avoid that loss of speed if possible; I am already on the cusp of needing 10GbE.

receive_plot.sh empty with non-standard file structure (no front/rear)

Hi,

As alluded to in my earlier issue (#52), my drive structure doesn't include a front/rear because all my enclosures are front-facing. As such, I removed this from my drive_structures file.
I also go by rows rather than column (but I expected there to be minor issues with that), so that required some minor work.

However it appears drive_manager.py is hardcoded to look for the front/rear mountpoint pattern, failing to create a working receive_plot.sh upon first run of python3 drive_manager.py. Which in turn leads to unsuccessful plot transfers (I was getting "Remote NC kill!").
It would be fantastic if it could deduce the pattern from the drive_structures file instead.

To manually fix this for now I had to:

Open drive_manager.py, for get_path_info_by_mountpoint change the second entry from:

    elif info == 'column':
        return (mountpoint.split("/")[5])

To:

    elif info == 'row':
        return (mountpoint.split("/")[4])

Remove receive_plot.sh, transfer_is_active (on the harvester) and transfer_job_running (on the plotter side)
Re-run python3 drive_manager.py on the harvester
Re-run python3 plot_manager.py (or wait for the cronjob to kick in).
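
As an alternative to fixed split("/") indexes, the lookup could match path components by name, so that layouts with or without a front/rear level both work. This is only a sketch; path_component is a hypothetical helper, not the existing get_path_info_by_mountpoint.

```python
def path_component(mountpoint, prefix):
    """Return the first path component starting with prefix, or None if absent."""
    for part in mountpoint.strip('/').split('/'):
        if part.startswith(prefix):
            return part
    return None
```

For example, path_component('/mnt/enclosure0/column0/drive0', 'column') yields 'column0' no matter how many directory levels precede it.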

Add plot progress monitor

Need to add a progress monitor that looks at the wall time of a specific plot and sends a warning if it is getting beyond a user-specified value. This can alert you to different issues (full plot drives, full temp drive, etc) before it becomes a problem.
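
The core check could be as small as the following sketch. It assumes the plot's start time is known as epoch seconds (for example parsed from a log or taken from a process start time); the function and parameter names are hypothetical.

```python
import time


def plot_is_overdue(started_at, max_hours, now=None):
    """True if a plot started at started_at (epoch seconds) has run longer than max_hours."""
    if now is None:
        now = time.time()
    return (now - started_at) > max_hours * 3600
```

A cron-driven script could call this for every running plot and send one notification per overdue plot.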

Add plot verification

Add in plot verification against a number of plots reported on system.

auto_drive installation type in the install script but not usable

Tried to install using install.sh, and the option auto_drive is not a valid installation type. I checked install.sh and it does not accept that option. Is it located somewhere else?

update plot report for offline drives

Update the plot reports to take into account drives that are offlined so that space is not included in the totals.

Check for failed or stalled local plot moves

Figure out how to check and see if a local plot copy has failed and restart the process so we don't overfill our local -d drive.

Add per plot notification

k32 plots have different sizes

Are we testing?

testing = False
if testing:
    plot_dir = '/home/chia/plot_manager/test_plots/'
    plot_size = 10000000
    status_file = '/home/chia/plot_manager/transfer_job_running_testing'
else:
    plot_dir = "/mnt/ssdraid/array0/"
    plot_size = 108644374730  # Based on K32 plot size
    status_file = '/home/chia/plot_manager/transfer_job_running'

remote_checkfile = '/root/plot_manager/remote_transfer_is_active'

I just saw that in the installation instructions. After checking my files, I saw that they have different sizes. According to this code, it looks like the file size has to be exactly 108644374730 or it won't be validated as a successful plot. But all my plots are valid even though they have different sizes...
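
If an exact size match is in fact what the script does, one hedged alternative is to accept any size within a tolerance of the expected value. The sketch below takes the expected size from the snippet above; the function name and the 5% tolerance are assumptions.

```python
EXPECTED_K32_SIZE = 108644374730  # value from the config snippet above


def looks_like_complete_plot(size_bytes, expected=EXPECTED_K32_SIZE, tolerance=0.05):
    """True if size_bytes is within +/- tolerance (a fraction) of expected."""
    return abs(size_bytes - expected) <= expected * tolerance
```

A size window like this only filters out obviously truncated files; it does not replace a real plot check.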

replace_plot() Script on Failed

I'm getting this error: "replace_plot() Script on Failed"
What does it mean?
Help pls.
ps i have mount points like
"/mnt/enclosure0/column0/drive0"
so it must work with condition
p.device.startswith('/dev/sd') and p.mountpoint.startswith('/mnt/enclosure') and p.mountpoint.endswith(drive)

Add more robust error checking

Need to add more robust error checking on both the plotter and NAS side to monitor for broken plot transfers and alert us to the fact something is wrong. This could look at multiple things such as a failure of the plot check.

We could create a function that internally monitors the plot moves and triggers an error/warning/notification if it fails.

Every once in a while we have a plot move get stuck and we need to know why and be able to recover before it stops our plotting.

Add -t and -d drive monitoring

Add monitoring to watch -t and -d drive space and alert user when it goes over a user-defined setting.
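
A minimal sketch of the space check using only the standard library. The function names are hypothetical, and the paths for the -t and -d drives plus the limit would come from the user's config.

```python
import shutil


def percent_used(path):
    """Return the percentage of the filesystem holding path that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def over_limit(path, limit_percent):
    """True if the filesystem holding path is fuller than limit_percent."""
    return percent_used(path) > limit_percent
```

Calling over_limit() for each monitored directory on every cron run, and alerting when it returns True, would cover the case described here.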

plot_manager.py - Update proper plot identification

Ran into an issue where we have a plot that was named .plot but was not the right size and the system stopped processing plot moves as a result. Need to fix this!!!

Flaw in auto drive mounting for SAS drives

The script will attempt to add SAS drives (which show as two drives e.g. sdb & sdh) twice. These both have the same UUID naturally, but since the script uses unique drive letters it doesn't see that these are duplicates.
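
One possible fix is to deduplicate candidate devices by their shared identifier before mounting. The sketch assumes the script can produce (device_name, uuid) pairs; the function name is hypothetical and how the identifier is read is left out.

```python
def unique_by_uuid(devices):
    """Keep only the first (device_name, uuid) pair seen for each UUID."""
    seen = set()
    unique = []
    for name, uuid in devices:
        if uuid in seen:
            continue  # second path to a drive we already have
        seen.add(uuid)
        unique.append((name, uuid))
    return unique
```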

Add temp drive life remaining to reports

Add temp drive life remaining to daily plot reports.

Fix list index out of range error

drive_manager.py throws an IndexError when no drives are available to look at. Need to add exception handling.

root@chianas03:~/plot_manager# ./drive_manager.py 
send_new_plot_notification() Started
update_receive_plot() Started
Total Serverwide Plots: 0
Traceback (most recent call last):
  File "./drive_manager.py", line 747, in <module>
    main()
  File "./drive_manager.py", line 740, in main
    update_receive_plot()
  File "./drive_manager.py", line 500, in update_receive_plot
    if current_plotting_drive == get_plot_drive_to_use():
  File "./drive_manager.py", line 391, in get_plot_drive_to_use
    return (natsorted(available_drives)[0][0])
IndexError: list index out of range
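
A guard along these lines would avoid the crash. This is a sketch, not the project's code: first_available_drive is a hypothetical name, and the built-in sorted() stands in for natsorted() so the example has no third-party dependency (the ordering therefore differs for names such as drive10 versus drive2).

```python
def first_available_drive(available_drives):
    """Return the first field of the first sorted entry, or None if the list is empty."""
    if not available_drives:
        return None
    return sorted(available_drives)[0][0]
```

The caller would then need to handle None, for example by logging that no drives are available and sending a notification instead of updating receive_plot.sh.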

Where do I find /root/plot_manager/kill_nc.sh ?

Where do I find the contents of the script /root/plot_manager/kill_nc.sh ?

Extend System Notifications

Create notifications based on environments (drive temps), smart drive reports, etc.

Add network traffic monitoring to "Plot in Progress" checks

Currently, we write a status file on both the sending and receiving end to determine if we are moving a plot, since we only want to move one at a time (at least for now). I want to add physical monitoring of the network link between the NAS and the Plotter (with a selectable minimum bw rate) as an additional check to make sure we are really sending a plot.

Additionally, if we see the checkfile in place but we are below the network bw threshold, we can force a reset and try to send the plot again.
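
The decision logic could be sketched as below, assuming two readings of the interface's sent-bytes counter taken some seconds apart. Both function names are hypothetical, and how the counters are read is left out.

```python
def throughput_mb_per_s(bytes_before, bytes_after, interval_seconds):
    """Average throughput in MB/s between two byte-counter readings."""
    if interval_seconds <= 0:
        raise ValueError('interval_seconds must be positive')
    return (bytes_after - bytes_before) / interval_seconds / 1_000_000


def transfer_looks_stalled(checkfile_exists, mb_per_s, minimum_mb_per_s):
    """True if the checkfile claims a transfer is running but traffic is below the minimum."""
    return checkfile_exists and mb_per_s < minimum_mb_per_s
```

When transfer_looks_stalled() returns True, the script could remove the checkfile and retry the plot, as described above.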

make sentry.io optional flag

If you do not use sentry.io you have to remove the sentry capture_exception clauses, make this automatic with a flag.
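
A sketch of how such a flag might work, assuming a boolean read from the config. USE_SENTRY and report_exception are hypothetical names; sentry_sdk.capture_exception is the existing Sentry SDK call, imported lazily so the package is only needed when the flag is on.

```python
USE_SENTRY = False  # hypothetical flag; would be read from the config file


def report_exception(exc, use_sentry=USE_SENTRY):
    """Forward exc to Sentry only when enabled; return True if it was forwarded."""
    if not use_sentry:
        return False
    import sentry_sdk  # lazy import keeps sentry_sdk optional
    sentry_sdk.capture_exception(exc)
    return True
```

Each existing capture_exception clause could then be replaced by a call to this wrapper.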

Update chianas drive_manager.py to include check for active network traffic

In certain circumstances ./drive_manager.py can think there is a remote transfer going when there is not. We need to add in a network check like we do on the sending end so we can somewhat gracefully reset if a remote transfer failed and we still think it is going. This came up when I ran out of drive space!!
