The mad from mar-file-system

mad's Issues

Consider adding logging to the code

I've run across a few print statements that lead me to believe I was having trouble debugging something in the past. I should add logging to all the code to make debugging easier in the future.
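A minimal sketch of what replacing the print statements could look like, using the standard library logging module. The logger name "mad" and the helper names configure_logging and describe_step are assumptions for illustration, not names from the code base.

```python
import logging

# Hypothetical module-level logger; the name "mad" is an assumption.
logger = logging.getLogger("mad")


def configure_logging(verbose: bool = False) -> None:
    """Send log records to stderr: DEBUG when verbose, INFO otherwise."""
    logging.basicConfig(
        level=logging.DEBUG if verbose else logging.INFO,
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )


def describe_step(step: str) -> str:
    """Log a step at DEBUG level instead of printing it, then return it."""
    logger.debug("running step: %s", step)
    return step
```

Calling configure_logging(verbose=True) once at start-up would then surface the debug messages without touching the call sites again.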

add gpfs device to config file

This value is used really frequently for gpfs operations. It seems silly to specify it for every command.

Consider deploy mounts command

Currently there is a bash script for mounting NFS shares into the proper place on metadata, batch, and interactive nodes. If there are multiple repos, you need multiple copies of that script to do the mounts. It may be easy to move that functionality into a deploy option that works from the config file for any repo, while keeping as is the parts that already work without modification, such as starting fuse.

delns doesn't complain if it does nothing

If you ask for a nonexistent namespace in a nonexistent repo to be removed, it will happily do nothing and not raise an error.
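One way delns could complain is to raise a dedicated exception before doing any work. This is only a sketch: the dict-of-lists config shape, the function name delete_namespace, and the NamespaceNotFoundError class are all hypothetical stand-ins for the real config objects.

```python
from __future__ import annotations


class NamespaceNotFoundError(LookupError):
    """Raised when the requested repo or namespace is not in the config."""


def delete_namespace(repos: dict[str, list[str]], repo: str, namespace: str) -> None:
    """Remove `namespace` from `repo`, failing loudly if either is missing."""
    if repo not in repos:
        raise NamespaceNotFoundError(f"repo {repo!r} does not exist")
    if namespace not in repos[repo]:
        raise NamespaceNotFoundError(
            f"namespace {namespace!r} does not exist in repo {repo!r}"
        )
    repos[repo].remove(namespace)
```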

create zpool only creates pools with 20 disks

Maybe we can keep 20 as the default but make the number optional, in case a different deployment needs a different number of disks per pool.
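A sketch of the optional disk count, assuming the disks arrive as a flat list of device names. The function name group_disks_into_pools and the constant DEFAULT_DISKS_PER_POOL are hypothetical. Only the value 20 comes from the current behaviour.

```python
from __future__ import annotations

# 20 matches the current hard-coded behaviour; the constant name is an assumption.
DEFAULT_DISKS_PER_POOL = 20


def group_disks_into_pools(
    disks: list[str], disks_per_pool: int = DEFAULT_DISKS_PER_POOL
) -> list[list[str]]:
    """Split `disks` into equally sized groups, one group per zpool."""
    if disks_per_pool < 1:
        raise ValueError("disks_per_pool must be at least 1")
    if len(disks) % disks_per_pool != 0:
        raise ValueError(
            f"{len(disks)} disks cannot be split evenly into pools of {disks_per_pool}"
        )
    return [
        disks[start:start + disks_per_pool]
        for start in range(0, len(disks), disks_per_pool)
    ]
```

Refusing an uneven split is a design choice made for this sketch. A real deployment might prefer to leave the spare disks unused instead.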

Refactoring

There are some things I believe I did wrong in the way I wrote my code. I think I should go back and add type hints to everything. I would also like to clean up some anti-patterns, like returning two different types from one method or function. Those should be changed to raise an exception if the proper value cannot be returned, so the caller handles the exception instead of handling a wrong type. I think there are also try/except blocks that don't actually handle the exception usefully, so those need to be updated as well.

I also need to clean up some of the rather abbreviated variable names.

More descriptive variable names
Raise exceptions instead of returning None
Add type hints top to bottom
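The three points above could look roughly like this. The names find_pool, describe_pool, and PoolNotFoundError are invented for the example and do not refer to existing code.

```python
from __future__ import annotations


class PoolNotFoundError(LookupError):
    """Raised when no pool matches the requested name."""


def find_pool(pools: dict[str, int], name: str) -> int:
    """Return the disk count for `name`; raise instead of returning None."""
    try:
        return pools[name]
    except KeyError:
        raise PoolNotFoundError(f"no pool named {name!r}") from None


def describe_pool(pools: dict[str, int], name: str) -> str:
    """Handle the failure explicitly at the call site."""
    try:
        disk_count = find_pool(pools, name)
    except PoolNotFoundError as error:
        return f"skipped: {error}"
    return f"{name}: {disk_count} disks"
```

With this shape the return type of find_pool is always int, so callers never have to check for None.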

Consider tools for config struct generation for C

In the past I used a "quickobject" which would happily read any config file, as long as it was XML and made sense, and create data bindings for it. This could be useful in the future and could be restored if needed. It could also be useful for generating header files with the C structs needed by MarFS, to aid quick prototyping on the MarFS side.

jti says

Don’t forget about the need to generate a .h file, containing C struct declarations, if the fields of the abstract XML change. Perhaps we can just do that by hand, from now on.

Finalize config file format

The config file format is up in the air right now. We need to settle on a final form so the config objects can be updated, and so we can fix any bugs created by small name changes and the like.

config path must be absolute path to be able to deploy remote

If you give mad a relative config path at the command line, it will break later when trying to deploy remotely.
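A possible fix is to normalise the path once, right after argument parsing, so everything downstream sees an absolute path. The function name resolve_config_path is an assumption.

```python
from pathlib import Path


def resolve_config_path(raw_path: str) -> Path:
    """Expand "~" and make the path absolute relative to the current directory."""
    return Path(raw_path).expanduser().resolve()
```

Note that this only makes the path absolute on the local machine. The remote node still needs the file to exist at that location.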

Need a deploy namespace option

There should probably be a way to deploy a single namespace. Right now I just point everything at deploying the whole repo. So if you added namespaces but did not deploy them, and then add and deploy one new namespace, it will deploy them all.

consider list namespaces command

It might be helpful to list the namespaces in a repo. There is a lot of noise in config files that we don't always want to see.

Document weaknesses of config parser in python

~~Only string values can be used for conversion to XML. If you set a value that looks like a number to an object of type int the xml process gets real sad~~
There can be no empty values in the config file. I don't see any reason for that to be problematic, but issues could arise if we don't stick to that limitation. None values result in the element being skipped entirely, so the element will be missing from the output.

Namespaces must have unique names across all repos

If you have two repos and they both have a namespace named "spamandeggs", you will end up with a fileset collision in GPFS. Maybe we could change the fileset naming to be reponame+namespacename to avoid this.
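The reponame+namespacename idea could be as small as the sketch below. The function name fileset_name and the underscore separator are assumptions. The text above does not say whether a separator is wanted.

```python
def fileset_name(repo_name: str, namespace_name: str, separator: str = "_") -> str:
    """Build a fileset name that stays unique when two repos share a namespace name."""
    return f"{repo_name}{separator}{namespace_name}"
```

One caveat: if repo or namespace names may themselves contain the separator, two different pairs can still produce the same fileset name, so the separator would have to be a character that is not allowed in either name.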

Verify code works with multiple jbods on one node

Ensure that setting up a repo works across multiple jbods, or that you can select a specific jbod.

Unit Test Plan

There need to be unit tests to make certain that methods function correctly. These should be separate from the tests that exercise deployments. I think the GPFSInterface and ZFSInterface tests will have to be marked as "cluster" tests so they only run when testing on a cluster. The NodeBase, StorageInterface, and MetadataInterface tests can all run on a single node. They can probably also run on the cluster, so they should be compatible with that plan.
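A sketch of how the "cluster" marking could work. A pytest marker would be the more direct fit for the wording above, but this example uses the standard library's unittest.skipUnless to stay dependency-free. The environment variable MAD_CLUSTER_TESTS, the decorator name cluster_test, and the placeholder test bodies are all assumptions.

```python
import os
import unittest

# Hypothetical opt-in switch: export MAD_CLUSTER_TESTS=1 on a real cluster.
RUN_CLUSTER_TESTS = os.environ.get("MAD_CLUSTER_TESTS") == "1"


def cluster_test(test_item):
    """Skip the decorated test or test class unless cluster tests are enabled."""
    return unittest.skipUnless(
        RUN_CLUSTER_TESTS, "set MAD_CLUSTER_TESTS=1 to run cluster tests"
    )(test_item)


class NodeBaseTests(unittest.TestCase):
    """Runs anywhere, including a single node."""

    def test_runs_on_single_node(self):
        self.assertTrue(True)  # placeholder for a real NodeBase check


@cluster_test
class GPFSInterfaceTests(unittest.TestCase):
    """Only meaningful where GPFS is actually installed."""

    def test_requires_cluster(self):
        self.assertTrue(True)  # placeholder for a real GPFSInterface check
```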

Unit Tests

Consider moving ZFS setup scripts to new files

ZFS setup is likely outside the everyday admin responsibility of working with MarFS. It is a task you only really do once per system. Automating that setup is useful, especially for test systems, but it might not be the best use of time as far as creating quality code goes.

Preserve XML comments when converting to python

Currently it seems the config comments are lost when creating python objects from XML. These comments need to be preserved and written back out when a new file is generated.
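If the parser is built on xml.etree.ElementTree, one possible approach on Python 3.8 or newer is to ask the TreeBuilder to keep comments in the tree. Whether mad's parser actually uses ElementTree is an assumption here, and the element names in the usage note are made up.

```python
import xml.etree.ElementTree as ET


def parse_with_comments(xml_text: str) -> ET.Element:
    """Parse XML and keep comments inside the root element as nodes in the tree."""
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    return ET.fromstring(xml_text, parser=parser)


def to_xml_text(root: ET.Element) -> str:
    """Serialise the tree back to text, comments included."""
    return ET.tostring(root, encoding="unicode")
```

For example, parsing "<config><!-- note --><repo>r1</repo></config>" and serialising it again keeps the comment in place. Comments that sit outside the root element are not kept by this approach.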

Deployment test plan

~~There is currently no way to check an existing deployed marfs repository. There should be a method in the storage node tools that will verify everything is set up correctly.~~

There should be tests that can be run to identify problems with a deployment, such as not enough scatters or a missing cap directory.

I think we should create deployment tests for each interface:

GPFS Node
ZFS Storage Node

Create broad test plan with a complex config file

There needs to be some testing performed with multiple marfs repositories in one config file, to make sure we don't see problems with multiple repos. Multiple namespaces have been set up, but not actually tested.

make zpool creation smarter

zpool creation is based on the assumption of one jbod. When the disks are collected to create zpools, there is no distinction between JBODs; it will just get disks. I need to add a way to get the disks in a specific jbod and use those to make zpools. There should be a way to target a specific scsi device ID when creating zpools.

Update config file interface and functionality to match new MarFS features

There have been a number of changes to the MarFS config file in other work. I need to update the config structures and probably some functionality to work correctly.

Need unit tests for ConfigTool

Update front facing files to use argparse instead of fire

Fire was used in development for quick command line testing. There is no real need to depend on Fire in production, so we should switch the files that have a command line interface to argparse instead.
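A rough idea of what the argparse version could look like. The delns and deploy subcommands are taken from the other issues in this list, but the --config flag and the positional argument names are assumptions, not the current interface.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per mad operation."""
    parser = argparse.ArgumentParser(prog="mad")
    parser.add_argument("--config", required=True, help="path to the MarFS config file")
    subcommands = parser.add_subparsers(dest="command", required=True)

    delns = subcommands.add_parser("delns", help="delete a namespace from a repo")
    delns.add_argument("repo")
    delns.add_argument("namespace")

    deploy = subcommands.add_parser("deploy", help="deploy a repo")
    deploy.add_argument("repo")
    return parser
```

A call such as build_parser().parse_args(["--config", "/etc/marfs.xml", "delns", "repo1", "ns1"]) then yields a namespace object with command, repo, and namespace attributes.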

Create config file from python config objects

Currently we can read in a config file and use it to create python objects for interacting with the configuration data. But we cannot persist changes to the config, because there is no process for converting the python objects back to XML. That process needs to be created so it will be possible to make changes to the config in python and then write the changes back to the file.
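A minimal sketch of the reverse direction, assuming the config objects can be treated as dataclasses. The Namespace class, its quota field, and the function name to_element are hypothetical. The real config objects and element names may differ.

```python
from dataclasses import dataclass, fields
from typing import Optional
import xml.etree.ElementTree as ET


@dataclass
class Namespace:
    """Hypothetical stand-in for one of the real config objects."""
    name: str
    quota: Optional[str] = None


def to_element(obj, tag: str) -> ET.Element:
    """Turn a dataclass instance into an element with one child per field."""
    element = ET.Element(tag)
    for field in fields(obj):
        value = getattr(obj, field.name)
        if value is None:
            continue  # None fields produce no element at all
        ET.SubElement(element, field.name).text = str(value)
    return element
```

Writing the result to disk would then be ET.ElementTree(to_element(ns, "namespace")).write(path), where ns is a config object and path is the target file.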

Config command line tools

There is a need for a command line interface to modify marfs config files. Once #4 is complete we can get started on tools for this. We want to be able to:

add/delete/change namespaces from command line
add repositories from command line
set all namespaces in a REPO to read only
create many namespaces at once from command line with a default set of options
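The last item in the list, creating many namespaces from a default set of options, might reduce to something like the following. Plain dicts stand in for the real config objects, and the function name make_namespaces and the option keys in the usage note are assumptions.

```python
from __future__ import annotations


def make_namespaces(names: list[str], defaults: dict[str, str]) -> list[dict[str, str]]:
    """Create one namespace record per name, each starting from the same defaults."""
    seen: set[str] = set()
    namespaces = []
    for name in names:
        if name in seen:
            raise ValueError(f"duplicate namespace name: {name!r}")
        seen.add(name)
        namespace = dict(defaults)  # copy so records do not share state
        namespace["name"] = name
        namespaces.append(namespace)
    return namespaces
```

For example, make_namespaces(["ns1", "ns2"], {"perms": "RW"}) gives two independent records that differ only in their name.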
