josephhardinee / pydsd
Python library for working with disdrometer data.
License: GNU Lesser General Public License v2.1
Hi @josephhardinee, I have a problem processing files when I try to use functions involving time. I am trying to run this code on the ATM4.txt files:
import pydsd
import matplotlib.pyplot as plt
import pydsd.plot
import pydsd.partition
filename = 'merged_file.txt'
dsd = pydsd.read_parsivel(filename)
dsd.calculate_dsd_parameterization()
pydsd.plot.plot_dsd(dsd)
plt.show()
Here is the error:
runfile('/home/jeff/.config/spyder-py3/DSD_sample.py', wdir='/home/jeff/.config/spyder-py3')
Traceback (most recent call last):
File "/home/jeff/anaconda3/envs/wradlib/lib/python3.10/site-packages/spyder_kernels/py3compat.py", line 356, in compat_exec
exec(code, globals, locals)
File "/home/jeff/.config/spyder-py3/DSD_sample.py", line 21, in
pydsd.plot.plot_dsd(dsd)
File "/home/jeff/anaconda3/envs/wradlib/lib/python3.10/site-packages/PyDSD-1.0.6.2+5.g0a54715-py3.10.egg/pydsd/plot/plot.py", line 72, in plot_dsd
dsd.time["data"].filled(),
AttributeError: 'list' object has no attribute 'filled'
Do you have any idea how to fix this? My programming skills are not that advanced. Thank you.
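One possible workaround, continuing from the snippet above (untested, and an assumption about what plot_dsd expects): the plotting code calls .filled() on dsd.time['data'], so wrapping the plain list of times in a masked array before plotting might get past this error.

import numpy as np

# Hypothetical workaround: convert the plain list of times into a masked array
# so that dsd.time['data'].filled() works inside plot_dsd.
dsd.time['data'] = np.ma.masked_array(dsd.time['data'])
pydsd.plot.plot_dsd(dsd)
plt.show()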
After talking with @nguy the following represents what we see as the roadmap to getting PyDisdrometer to a V1 release.
Need to add more QC filters. First among them is a filter based on number of detected drops.
Also need to provide a notebook showing the new filtering steps.
I want to calculate and plot the slope coefficient of the DSD. I have tried the method of moments this way; is it useful?

self.fields['mslope'] = {'data': np.zeros(len(self.time))}
self.fields['mslope']['data'] = self._calculate_mslope()

def _calculate_mslope(self):
    # Moment-ratio estimate of the slope parameter from the 2nd, 3rd and 4th moments.
    G = self._calc_mth_moment(3) ** 2 / (
        self._calc_mth_moment(2) * self._calc_mth_moment(4))
    mslope = (self._calc_mth_moment(2) / self._calc_mth_moment(3)) * (G / (1.0 - G))
    return mslope
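For reference (assuming a gamma DSD, which is what this estimator usually targets): if N(D) = N0 * D**mu * exp(-Lambda * D), the nth moment is Mn = N0 * Gamma(mu + n + 1) / Lambda**(mu + n + 1). Then G = M3**2 / (M2 * M4) = (mu + 3) / (mu + 4), so G / (1 - G) = mu + 3 and (M2 / M3) * G / (1 - G) = Lambda. So the expression above should recover the slope parameter, provided the moments are computed over the same bins.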
Select the time you want from the file. So for instance, if your file was 10 UTC to 13 UTC, you could read in only the times between 11 UTC and 12 UTC. This would be useful for the 2DS and HVPS readers, since they have to be narrowed down to only liquid times before analysis.
Need to make sure the spectral handling works with 2DVD as well as making sure it can be made more general.
This is now the 2DS only issue.
It would be great if a reader could be added to read in and process 2DS and HVPS .txt files, as these have similar measurements to disdrometers.
2DS: This is split into two .txt files, one for horizontal and one for vertical.
- Time is recorded in seconds since 00 UTC; dividing by 3600 puts it into hours.
- The concentration column is the total concentration of drops at that particular time.
- All columns whose header begins with "C:" are the concentrations of drops in each bin, in #/L/um.
- The spread of the bins is variable.
I believe the data is also tab-separated. The example shown is one I roughly converted to an Excel spreadsheet to get a better view of the headers.
HVPS: Vertical only, .txt file
Similar columns and locations to the 2DS data, but the bin sizes are different, as this instrument measures larger particles.
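A rough reading sketch based on the description above; the column names, the time column, and the tab separator are assumptions about the file layout, not a confirmed format.

import pandas as pd

# Read one of the 2DS/HVPS .txt files (assumed tab-separated with a header row).
df = pd.read_csv('2DS_20150205a_H.txt', sep='\t')       # file name is illustrative

df['time_hours'] = df['Time'] / 3600.0                  # seconds since 00 UTC -> hours
conc_cols = [c for c in df.columns if c.startswith('C:')]
binned_conc = df[conc_cols]                             # per-bin concentrations, #/L/um
total_conc = df['Concentration']                        # total concentration column (assumed name)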
It would be nice if the temperature could be set manually, as well as the time period to evaluate. Choosing the start and end times of the evaluation would allow the user to select only the period when the drops were liquid.
While PyDisdrometer has some basic plotting routines, I plan on going through and adding more before the v1.0 release. Please comment here on any routines in particular you would like to see.
The DSD object needs the ability to filter out size columns from the drop spectra and from N(D). This helps with QC and cross comparisons.
I'm going to create issues for each of the remaining blocks to a v1.0 release.
First is file reader unit/functional tests. All file readers need at least a basic example file/functional test to make sure they don't get broken.
If anyone wants to put together a small example file with 5 time samples and PR it in with a note as to what reader it belongs to, that would make my life easier. I'll also accept functional PRs that test the whole reading process for each reader.
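A minimal functional-test sketch of the sort described above (pytest style; the sample file name and the five-sample length are assumptions about hypothetical test data, and the field access follows the dsd.time['data'] / dsd.fields['Nd']['data'] pattern used elsewhere on this page):

import os
import pydsd

SAMPLE_FILE = os.path.join("testdata", "parsivel_five_samples.txt")   # hypothetical file

def test_read_parsivel_functional():
    dsd = pydsd.read_parsivel(SAMPLE_FILE)
    # The reader should return a DropSizeDistribution with one entry per time sample.
    assert dsd is not None
    assert len(dsd.time["data"]) == 5
    assert dsd.fields["Nd"]["data"].shape[0] == 5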
As far as I can tell, the data model is not fully self-describing.
I think we need a self.time_start (as a datetime object) and a self.location = (lat, lon).
While these could go in notes, I think basic usability requires them to be exposed on the base class.
I thought I'd bring this offline conversation into issues to record for future changes and so we know why we chose a certain design.
Currently DropSizeDistribution objects store data directly on the class (e.g. dsd.time). I propose using dictionaries to store the data, specifically dsd.fields. Storing the data as dictionaries would also allow metadata to be kept alongside it (especially since there are no standard units in this field of study). In addition, the derived fields could be stored under dsd.derived_fields, which would clearly separate which fields were derived by PyDisdrometer.
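To make the proposal concrete, a field entry might look like the sketch below (assuming nd_array and d0_array are the corresponding numpy arrays; the metadata keys and units are illustrative, not a fixed schema):

dsd.fields['Nd'] = {
    'data': nd_array,                      # measured N(D), one row per time sample
    'units': 'm^-3 mm^-1',
    'long_name': 'Liquid water particle concentration',
}
dsd.derived_fields['D0'] = {
    'data': d0_array,                      # computed by PyDisdrometer
    'units': 'mm',
    'long_name': 'Median volume diameter',
}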
Secondly, I would suggest standardizing time units within the package. I convert all time units to Epoch time in this package for easier plotting routines.
To plot the time in the file along the x-axis, I changed the reader as follows:
import numpy as np
from DropSizeDistribution import DropSizeDistribution
import datetime
import pandas as pd

UTC_OFFSET_TIMEDELTA = datetime.timedelta(0, 19800)  # 5 * 3600 + 30 * 60; Indian Standard Time offset used to convert IST to UTC.

def read_jwd_utc(filename):
    '''
    Takes a filename pointing to a Joss-Waldvogel file and returns
    a drop size distribution object.

    Usage:
    dsd = read_jwd_utc(filename)

    Returns:
    DropSizeDistribution object
    '''
    reader = JWDReader_utc(filename)
    dsd = DropSizeDistribution(reader.time, reader.Nd, spread=reader.spread,
                               diameter=reader.diameter, rain_rate=reader.rain_rate,
                               bin_edges=reader.bin_edges)
    return dsd

class JWDReader_utc(object):
    '''
    JWDReader_utc takes a filename as its only argument (for now).
    This should be a Joss-Waldvogel data file.
    '''
    diameter = np.array([
        0.359, 0.455, 0.551, 0.656, 0.771, 0.913, 1.116, 1.331, 1.506, 1.665,
        1.912, 2.259, 2.584, 2.869, 3.198, 3.544, 3.916, 4.350, 4.859, 5.373])
    spread = np.array([
        0.092, 0.100, 0.091, 0.119, 0.112, 0.172, 0.233, 0.197, 0.153, 0.166,
        0.329, 0.364, 0.286, 0.284, 0.374, 0.319, 0.423, 0.446, 0.572, 0.455])

    def __init__(self, filename):
        self.filename = filename
        self.rain_rate = []
        self.Nd = []
        self.time = []

        self._read_file()
        self._prep_data()

        self.bin_edges = np.hstack(
            (0, self.diameter + np.array(self.spread) / 2))

    def conv_md_to_nd(self, Nd):
        F = 0.005   # sampling area (m^2)
        t = 30.0    # integration time (s)
        v = 9.65 - 10.3 * np.exp(-0.6 * self.diameter)   # terminal fall speed per bin
        return np.divide(Nd, (F * t * np.multiply(v, self.spread)))

    def _read_file(self):
        with open(self.filename) as f:
            next(f)  # skip header line
            for i, line in enumerate(f):
                # Date and time are in the first two tab-separated columns (IST); shift to UTC.
                self.time.append(
                    datetime.datetime.strptime(
                        line.split('\t')[0] + line.split('\t')[1],
                        '%Y-%m-%d%H:%M:%S') - UTC_OFFSET_TIMEDELTA)

    def _prep_data(self):
        self.Nd = np.array(self.Nd)
        self.time = pd.Series(np.array(self.time))
        # self.time = self.time - self.time[0]
        self.rain_rate = np.array(self.rain_rate)
Hi! PyDSD seems to be a good tool for DSD processing and worked well when I tried it. However, what I have are files containing one minute of disdrometer data each. Is it possible to process multiple Parsivel disdrometer files in the ATM4.txt format using PyDSD? Thank you very much!
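One possible approach, as a sketch only (untested): since read_parsivel takes a single file (as in the merged_file.txt example at the top of this page), you could concatenate the per-minute ATM4-style text files first and then read the merged result. The glob pattern below is an assumption about how the files are named.

import glob
import pydsd

files = sorted(glob.glob('ATM4_*.txt'))          # hypothetical per-minute file names
with open('merged_file.txt', 'w') as out:
    for fn in files:
        with open(fn) as f:
            out.write(f.read())

dsd = pydsd.read_parsivel('merged_file.txt')
dsd.calculate_dsd_parameterization()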
If possible, it would be useful to pick the temperature used, as well as the times used in analysis.
When installing PyDSD from source, the fit module is missing from the installation. It should be added in setup.py.
@nguy, I am thinking of announcing an end of life for Python 2.7 support at the end of the year (or earlier?). It mirrors the move by a lot of projects to cut support, and it would make some maintenance tasks a fair bit easier. Thoughts?
I am using radar data in vertical pointing mode, so how can I perform scattering simulations for the case of vertical incidence?
Thank you
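Not a PyDSD call, just a plain PyTMatrix sketch of what a vertically pointing geometry might look like; the wavelength, refractive index, drop radius, and axis ratio below are placeholders.

from pytmatrix.tmatrix import Scatterer
from pytmatrix import tmatrix_aux, refractive, radar

scatterer = Scatterer(radius=1.0, wavelength=tmatrix_aux.wl_X,
                      m=refractive.m_w_10C[tmatrix_aux.wl_X], axis_ratio=1.0)
scatterer.set_geometry(tmatrix_aux.geom_vert_back)   # vertical incidence, backscatter
print(radar.refl(scatterer))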
Port over the UA, TS calculations, etc.
I have been trying to scatter particles at W band and it fails (kernel dies, or python restarts). I have tried narrowing it down to a particular drop size but have had no luck. I have attached a "short" file from the RACORO field campaign of a DSD of just one time step that failed (this must be read in using read_racoro, which I just submitted a pull request for). I also tested the W-band scattering on an APU file from IFloodS, and it failed as well. Do you know what is causing this failure?
IFloodS_APU11_2013_0529_dsd.txt
racoro2ds_short.txt
TypeError Traceback (most recent call last)
in ()
----> 1 dsd.calculate_R_Zh_relationship()
/Users/matt493/PyDSD/pydsd/DropSizeDistribution.pyc in calculate_R_Zh_relationship(self)
439
440 popt, pcov = expfit(
--> 441 np.power(10, 0.1 * self.fields["Zh"]["data"][self.rain_rate["data"] > 0]),
442 self.fields["rain_rate"]["data"][self.fields["rain_rate"]["data"] > 0],
443 )
TypeError: 'NoneType' object has no attribute '__getitem__'
It seems this is occurring because self.rain_rate doesn't exist. It is now in "fields", so it should be accessed as self.fields['rain_rate']['data'] instead.
We need support for saving the scattering matrices to a file, and reloading these. Under the hood this is supported by PyTMatrix so we just need to tap into that. When processing a large number of files this can significantly reduce the computational overhead.
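For reference, a rough sketch of the PyTMatrix hooks this would presumably wrap (the wavelength, refractive index, and D_max below are illustrative):

from pytmatrix.tmatrix import Scatterer
from pytmatrix.psd import PSDIntegrator
from pytmatrix import tmatrix_aux, refractive

scatterer = Scatterer(wavelength=tmatrix_aux.wl_X,
                      m=refractive.m_w_10C[tmatrix_aux.wl_X])
scatterer.psd_integrator = PSDIntegrator()
scatterer.psd_integrator.D_max = 10.0
scatterer.psd_integrator.init_scatter_table(scatterer)

scatterer.psd_integrator.save_scatter_table('xband_10C.scatter')   # compute once, save...
scatterer.psd_integrator.load_scatter_table('xband_10C.scatter')   # ...and reload later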
Hello josephhardinee,
Thank you for this excellent project. I have some Parsivel data and calculate_dsd_parameterization returns an empty (all-NaN) field for D0 every time. Is this something known to anyone else? I can send example data if need be. All the other parameters in ['D0', 'Dmax', 'Dm', 'Nt', 'Nw', 'N0', 'W', 'mu'] seem fine.
Could it possibly be related to the use of numpy functions like cumsum instead of nancumsum on the masked Nd data?
I get a"Warning message" that
UserWarning: Warning: converting a masked element to nan.
return array(a, dtype, copy=False, order=order)
/home/dagnachew/anaconda3/lib/python3.6/site-packages/pydsd/DropSizeDistribution.py:349: RuntimeWarning: invalid value encountered in less
cross_pt = list(cum_W < (cum_W[-1] * 0.5)).index(False) - 1
(Pdb) c
/home/dagnachew/anaconda3/lib/python3.6/site-packages/pytmatrix/psd.py:183: RuntimeWarning: invalid value encountered in greater
psd[(D > self.D_max) | (D==0.0)] = 0.0
On Ubuntu 16. Python 3.6 (Anaconda)
Thanks
How can we read more than one file at a time and combine the data to plot a combined total drop size distribution?
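A rough sketch of one way to do this, assuming all files share the same diameter bins and that the diameter is stored as a dict like the other fields (both are assumptions): read each file separately, stack the N(D) arrays, and average over time.

import glob
import numpy as np
import matplotlib.pyplot as plt
import pydsd

nd_all = []
for fn in sorted(glob.glob('*.txt')):                  # hypothetical file pattern
    dsd = pydsd.read_parsivel(fn)
    nd_all.append(np.asarray(dsd.fields['Nd']['data']))

combined_nd = np.nanmean(np.vstack(nd_all), axis=0)    # mean N(D) per diameter bin

plt.semilogy(dsd.diameter['data'], combined_nd)
plt.xlabel('Diameter (mm)')
plt.ylabel('N(D) (m^-3 mm^-1)')
plt.show()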
PyDisdrometer needs the ability to save out files into a standard format. While I have part of this started, if someone beats me to it I will accept it. The plan is for it to be a mostly CF-compliant netCDF file that is similar to the internal structure we currently use. If anyone has comments on this, please feel free to leave them below.
The APU data works well with PyDisdrometer -- how about adding support for 2DVD data as well? Thanks!
It would be nice to have the double-moment rain estimators from the disdrometer data as well, R(Kdp,Zdr) and R(Zh, Zdr). Thanks!
I would like to incorporate XArray into PyDSD to make many of the operations much easier (In particular, netCDF writing).
Here is my proposal, I would like some feedback.
All entries in fields become DataArrays. All coordinates will be stored in a DropSizeDistribution.coords. We will add properties for time, diameter, velocity bin to point into the coords dictionary. Most things will be able to stay the same as far as how they work. The one bigger change is the values would now be accessed as something like DropSizeDistribution['Nd'].values instead of by ['data'].
Benefits: We can autoroll datasets and make writing much easier. Xarray objects are easier to work with, and this opens the possibility of adding some dask integration as well.
@nguy and anyone else that has a vested interest, I would love to get your feedback. The actual work to write the changes will be pretty quick I believe.
One big thing I have not figured out how to address is making sure all of the lengths stay in sync. This is a problem with our current implementation as well though so we don't lose anything here.
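To illustrate the proposed layout (the names, dims, and attributes below are illustrative, not the current API):

import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range('2015-02-05', periods=3, freq='1min')    # placeholder coords
diameter = np.array([0.359, 0.455, 0.551])                     # mm, example bin centres

nd = xr.DataArray(
    np.zeros((len(time), len(diameter))),
    dims=('time', 'diameter'),
    coords={'time': time, 'diameter': diameter},
    attrs={'units': 'm^-3 mm^-1', 'long_name': 'Liquid water particle concentration'},
)
# Under the proposal the values would be reached via .values rather than ['data']:
print(nd.values.shape)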
Data is similar to the 2DS data in setup (similar columns and units). However, the bin headers are broken and must be read in from a separate file.
HVPS data:
https://www.dropbox.com/s/qdb9y0ts4nf8y05/HVPS_20150205a_V_M1ice.txt.gz?dl=0
HVPS bins:
https://www.dropbox.com/s/tmui3kpk6lrywuz/HVPS_Bins.csv?dl=0
@nguy I've been tossing around the idea for a while of pulling a lot of the operations out of the DropSizeDistribution object and moving them into library functions. This would simplify the DropSizeDistribution object quite a bit and make testing easier. It would also make it easier to do things like have an operation on a dsd object (such as a select) return a new dsd object.
The functionality I had in mind was moving scattering to its own submodule, and similarly for the rain fits that have been in the object as part of its legacy.
File "/home/yogesh/anaconda/lib/python2.7/site-packages/pydisdrometer/init.py", line 12, in
from . import partition
ImportError: cannot import name partition
Port over the Ulbrich and Atlas 1998 and Tokay and Short 1996 gamma distribution parameters, as well as exponential distribution parameter calculations.
Can I use PyDSD to remove drops from the DSD if their fall velocity differs by more than ±50% from the empirical value given by the Beard (1976) theoretical velocity equation?
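I don't know of a built-in call for this, but a plain numpy sketch of such a filter might look like the following; the array shapes and the terminal-velocity function are assumptions (the Atlas-type formula from the JWD reader above is used only as a stand-in for Beard 1976).

import numpy as np

def v_term(D):
    # Stand-in terminal velocity (m/s); replace with a Beard (1976) implementation.
    return 9.65 - 10.3 * np.exp(-0.6 * D)

def filter_spectrum_by_velocity(spectrum, velocity_bins, diameter_bins, tolerance=0.5):
    """Zero out spectrum counts whose velocity bin is more than +/-50% from v_term(D).

    spectrum: (n_time, n_velocity, n_diameter) drop counts.
    """
    v_expected = v_term(diameter_bins)                              # (n_diameter,)
    vv, ve = np.meshgrid(velocity_bins, v_expected, indexing='ij')  # (n_velocity, n_diameter)
    keep = np.abs(vv - ve) <= tolerance * ve
    return spectrum * keep[np.newaxis, :, :]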
I appreciate all the built-in file readers, but for instances where I apply QC to raw Parsivel data and then try to read it in, I need to convert to something akin to ARM's formatting:
import os
import numpy as np
import pydsd
from netCDF4 import Dataset, date2num, num2date

def read_precip(nc):
    # Remove any temp file left behind by a previous call.
    try:
        nc_temp.close()
        os.remove("temp.nc")
    except:
        pass
    nc_temp = Dataset("temp.nc", "w")
    time = nc_temp.createDimension("time", 1440)
    particle_size = nc_temp.createDimension("particle_size", 32)
    raw_fall_velocity = nc_temp.createDimension("raw_fall_velocity", 32)
    time_ = nc_temp.createVariable("time", "f4", ("time",))
    time_.units = "seconds since 1970-01-01 00:00:00 0:00"
    time_[:] = date2num(num2date(nc.variables["time"][:], nc.variables["time"].units),
                        units=time_.units)
    Nd_ = nc_temp.createVariable("number_density_drops", "f4", ("time", "particle_size"))
    Nd_[:] = np.nansum(nc.variables["num_conc_rain"][:], axis=1)
    vel_ = nc_temp.createVariable("fall_velocity_calculated", "f4", ("particle_size",))
    vel_[:] = v_gk(nc.variables["particle_size"][:]) / 100   # v_gk: my Gunn-Kinzer velocity helper (not shown)
    rr_ = nc_temp.createVariable("precip_rate", "f4", ("time",))
    rr_[:] = nc.variables["rain_rate_qc"][:]
    spec_ = nc_temp.createVariable("raw_spectrum", "f4",
                                   ("time", "particle_size", "raw_fall_velocity"))
    spec_[:] = np.moveaxis(nc.variables["qc_spectrum"][:], 1, 2)
    raw_velo_ = nc_temp.createVariable("raw_fall_velocity", "f4", ("raw_fall_velocity",))
    raw_velo_[:] = nc.variables["raw_fall_velocity"][:]
    dia_ = nc_temp.createVariable("particle_size", "f4", ("particle_size",))
    dia_[:] = nc.variables["particle_size"][:]
    width_ = nc_temp.createVariable("class_size_width", "f4", ("particle_size",))
    width_[:] = nc.variables["particle_size_bin_width"][:]
    try:
        return pydsd.read_parsivel_arm_netcdf("temp.nc")
    finally:
        nc_temp.close()
        os.remove("temp.nc")
This isn't a huge problem, but being able to just build the dsd object with fields from non-supported file types would be great.
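Related to that, a sketch of building the object directly, following the older-style constructor signature used in the JWDReader example above (newer versions may take a reader object instead, so treat the import path and signature as assumptions). All arrays are placeholders for your own QC'd data.

import numpy as np
from pydsd.DropSizeDistribution import DropSizeDistribution   # import path is an assumption

diameter = np.array([0.359, 0.455, 0.551])        # mm, example bin centres
spread = np.array([0.092, 0.100, 0.091])          # mm, example bin widths
bin_edges = np.hstack((0, diameter + spread / 2))
time = np.array([0.0, 60.0, 120.0])               # seconds
Nd = np.zeros((len(time), len(diameter)))         # your QC'd N(D) goes here
rain_rate = np.zeros(len(time))

dsd = DropSizeDistribution(time, Nd, spread=spread, diameter=diameter,
                           rain_rate=rain_rate, bin_edges=bin_edges)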
I notice that the 'nd' data is raised to the power of 10 in the 'read_file' function of the ParsivelReader class. What is the reason for that? According to the Parsivel disdrometer manual, the data following code '90' is exactly the 'nd' data, i.e. the number of drops per 1 m**3 of volume and per 1 mm of drop diameter spread. Am I misunderstanding anything?
I'd like to request support for reading in rain rates from NASA GV data (JW and 2DVD).
I have Parsivel and Thies LPM data and want to use PyDSD for analysis. But my data files are slightly different from the ones presented here as test data. I tried to load them using the package but got an error. If there is a way to feed in this data it would be very helpful. For reference I have uploaded sample data here: https://drive.google.com/drive/folders/1wNPYCEzhoq2gFVm98JZQpKTLP5V9KbSm?usp=sharing
The 2DVD instruments, as well as others, have diameter bins that get fairly large. This can cause issues for the underlying scattering code. We need to implement a feature that allows a maximum effective scattering parameter to be introduced to the object, something like dsd.set_scattering_limits(0, 8). We do need to discuss whether this should affect the dsd parameterization code as well or be limited to just the scattering. @nguy, your thoughts?
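A rough sketch of what such a limit might amount to internally (hypothetical, not an existing method): mask the diameter bins outside the requested range before handing N(D) to the scattering code.

import numpy as np

def apply_scattering_limits(Nd, diameter, d_min=0.0, d_max=8.0):
    """Zero out N(D) outside [d_min, d_max] (diameters in mm)."""
    outside = (diameter < d_min) | (diameter > d_max)
    Nd_clipped = np.array(Nd, copy=True)
    Nd_clipped[..., outside] = 0.0
    return Nd_clipped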
Convert all times to a standardized Epoch time - '1970-1-1T00:00:00+0:00'
I have calculated a new axis ratio relation for our site and want to use it to calculate radar parameters, to test the performance of the new equation. I replaced the equation in the def bc(D_eq) function of the DSR.py module, but when I run dsd.calculate_radar_parameters() the Python kernel dies.
def bc(D_eq):
    '''Beard and Chuang Drop Shape relationship model.

    Implementation of the Beard and Chuang Drop Shape model given in [1].
    This gives the ratio of the major to minor axis.

    Parameters
    ----------
    D_eq: float
        Volume Equivalent Drop Diameter

    Returns
    -------
    axis_ratio: float
        The ratio of the major to minor axis.

    See Also
    --------
    tb : Thurai and Bringi DSR
    pb : Pruppacher and Beard DSR

    References
    ----------
    ..[1] Beard, Kenneth V., Catherine Chuang, 1987: A New Model for the
        Equilibrium Shape of Raindrops. J. Atmos. Sci., 44, 1509-1524.
    '''
    # Remnant of the original polynomial, left commented out:
    # 2.628e-02 * np.power(D_eq, 2) + 3.682e-03 * np.power(D_eq, 3) -
    # 1.677e-04 * np.power(D_eq, 4)
    return 0.9946 + 0.02297 * D_eq - 0.02051 * D_eq ** 2 + 0.002795 * D_eq ** 3 - \
        0.0001849 * D_eq ** 4
I wanted to write this down since we've gone back and forth a bit (or at least I have).
I'm going to modify all the readers to have the following time field. Time will be stored as a series of integers representing seconds since 1970-1-1 (epoch time). So
time['data'] = array([ 1.32332381e+09, 1.32332382e+09, 1.32332383e+09]) for instance. It will have a units string set as well.
'units': 'seconds since 1970-1-1 00:00:00+0:00'
Currently we have a mix of epoch time, time offset from beginning of file, and datetime instances.
Does this work for you @nguy? I will try to route much of our time handling through some common functions, so that if we change our minds in the future it's an easier update than modifying each reader individually. I think this reflects what we discussed at AMS.
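One of those common helpers might look like the sketch below (illustrative only; the function name is not part of the package):

import datetime

EPOCH = datetime.datetime(1970, 1, 1)
EPOCH_UNITS = 'seconds since 1970-1-1 00:00:00+0:00'

def datetimes_to_epoch(times):
    """Convert a sequence of naive UTC datetimes into epoch seconds plus a units string."""
    data = [(t - EPOCH).total_seconds() for t in times]
    return {'data': data, 'units': EPOCH_UNITS}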
The DropSizeDistribution object needs support for adjusting the terminal fall speed based on air density, or a user supplied fall speed. We can start by implementing the Gunn/Kinzer adjustment. This should only affect the terminal fall speed when calculated by formula, not the measured fall speed. In those cases it would also only propagate along to rain rate I believe.
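A minimal sketch of the density adjustment, assuming a Foote and du Toit (1969) style correction with a 0.4 exponent (that exponent and the reference density are assumptions to confirm); per the note above, it would only be applied to formula-derived terminal speeds, not measured ones.

import numpy as np

def density_adjusted_velocity(v0, air_density, rho0=1.225):
    """Scale sea-level terminal fall speeds v0 (m/s) by (rho0 / rho) ** 0.4."""
    return np.asarray(v0) * (rho0 / np.asarray(air_density)) ** 0.4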
We've had a request internally to add the ability to calculate the reflectivity weighted doppler velocity from the DSD. Two things are needed:
Not sure I'll implement the second just yet, but the first should be straightforward.
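For the reflectivity-weighted fall velocity, a sketch of the usual Rayleigh (D^6) weighting over the measured bins; the argument names are placeholders:

import numpy as np

def reflectivity_weighted_velocity(Nd, diameter, spread, velocity):
    """Vz = sum(v(D) D^6 N(D) dD) / sum(D^6 N(D) dD), per time step."""
    weighted = velocity * diameter ** 6 * Nd * spread
    reflectivity = diameter ** 6 * Nd * spread
    return np.sum(weighted, axis=-1) / np.sum(reflectivity, axis=-1)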
The front page readme.md needs updating. It should include some of the new functionality, as well as just being better organized.
Good beginner issue.
Hello Sir,
In the calculation of the DSD I don't understand the term "spread"; is it the difference between two diameter intervals?

def conv_md_to_nd(self, Nd):
    F = 0.005
    t = 30.0
    v = 9.65 - 10.3 * np.exp(-0.6 * self.diameter)
    return np.divide(Nd, (F * t * np.multiply(v, self.spread)))

Can you explain?
We have rudimentary support for the droplet spectrum (rather than just N(D)). I'll be expanding this out to help with some support for various ARM VAPs. There will be three main components to this.