Giter Club home page Giter Club logo

seisgo's Introduction

SeisGo

A ready-to-go Python toolbox for seismic data analysis

plot1

Introduction

This package is currently heavily dependent on obspy (www.obspy.org) to handle seismic data (download, read, and write, etc). Users are referred to obspy toolbox for related functions.

Available modules

This package is under active development. The currently available modules are listed here.

  1. utils: This module contains frequently used utility functions not readily available in obspy.

  2. downloaders: This module contains functions used to download earthquake waveforms, earthquake catalogs, station information, continous waveforms, and read data from local files.

  3. obsmaster: This module contains functions to get and processing Ocean Bottom Seismometer (OBS) data. The functions and main processing modules for removing the tilt and compliance noises are inspired and modified from OBStools (https://github.com/nfsi-canada/OBStools) developed by Pascal Audet & Helen Janiszewski. The main tilt and compliance removal method is based on Janiszewski et al. (2019).

  4. noise: This module contains functions used in ambient noise processing, including cross-correlations and monitoring. The key functions were converted from NoisePy (https://github.com/mdenolle/NoisePy) with heavy modifications. Inspired by SeisNoise.jl (https://github.com/tclements/SeisNoise.jl), We modified the cross-correlation workflow with FFTData and CorrData (defined in types module) objects. The original NoisePy script for cross-correlations have been disassembled and wrapped in functions, primarily in this module. We also changed the way NoisePy handles timestamps when cross-correlating. This change results in more data, even with gaps. The xcorr functionality in SeisGo also has the minimum requirement on knowledge about the downloading step. We try to optimize and minimize inputs from the user. We added functionality to better manipulate the temporal resolution of xcorr results.

  5. plotting: This module contains major plotting functions for raw waveforms, cross-correlation results, and station maps.

  6. monitoring: This module contains functions for ambient noise seismic monitoring, adapted from functions by Yuan et al. (2021).

  7. clustering: Clustering functions for seismic data and velocity models.

  8. stacking: stacking of seismic data.

  9. types: This module contains the definition of major data types and classes.

Installation

SeisGo is available on PyPi (https://pypi.org/project/seisgo/). You can install it as a regular package pip install seisgo. The following instruction shows how to install seisgo with a virtual environment with github repository.

  1. Create and activate the conda seisgo environment

Make sure you have a working Anaconda installed. This step is required to have all dependencies installed for the package. You can also manually install the listed packages without creating the seisgo environment OR if you already have these packages installed. The order of the following commands MATTERS.

$ conda create -n seisgo -c conda-forge jupyter numpy scipy pandas numba pycwt python cartopy obspy mpi4py stockwell
$ conda activate seisgo

The jupyter package is currently not required, unless you plan to run the accompanied Jupyter notebooks in directory. mpi4py is required to run parallel scripts stored in scripts directory. The modules have been fully tested on python 3.7.x but versions >= 3.6 also seem to work from a few tests. You can create jupyter kernel:

$ conda activate seisgo
$ pip install --user ipykernel
$ python -m ipykernel install --user --name=seisgo

Install PyGMT plotting funcitons

Map views with geographical projections are plotted using PyGMT (https://www.pygmt.org/latest/). It seems that only PIP install could get the latest version [personal experience]. The following are steps to install PyGMT package (please refer to PyGMT webpage for trouble shooting and testing):

Install GMT through conda first into the SeisGo environment:

conda activate seisgo
conda config --prepend channels conda-forge
conda install  python pip numpy pandas xarray netcdf4 packaging gmt

You may need to specify the python version available on your environment. In ~/.bash_profile, add this line: export GMT_LIBRARY_PATH=$SEISGOROOT/lib, where $SEISGOROOT is the root directory of the seisgo environment. Then, run:

conda install pygmt

Test your installation by running:

python
> import pygmt
  1. Download SeisGo

cd to the directory you want to save the package files. Then,

$ git clone https://github.com/xtyangpsp/SeisGo.git
  1. Install seisgo package functions using pip

This step will install the SeisGo modules under seisgo environment. The modules would then be imported under any working directory. Remember to rerun this command if you modified the functions/modules.

$ pip install .
  1. Test the installation

Run the following commands to test your installation.

$ python
>>> from seisgo import obsmaster as obs
>>> tflist=obs.gettflist(help=True)
------------------------------------------------------------------
| Key    | Default  | Note                                       |
------------------------------------------------------------------
| ZP     | True     | Vertical and pressure                      |
| Z1     | True     | Vertical and horizontal-1                  |
| Z2-1   | True     | Vertical and horizontals (1 and 2)         |
| ZP-21  | True     | Vertical, pressure, and two horizontals    |
| ZH     | True     | Vertical and rotated horizontal            |
| ZP-H   | True     | Vertical, pressure, and rotated horizontal |
------------------------------------------------------------------
  1. Trouble shooting
  • Stockwell errors with numpy importing errors (numpy.core.multiarray errors) and/or fftw errors. Some libraries (e.g., FFTW) used by stockwell are only compatible with numpy <= 1.22.4.

Try the following steps:

$ conda activate seisgo
$ conda uninstall numpy stockwell
$ conda install -c conda-forge stockwell numpy==1.22.4
  • NETCDF4 errors. You may need to mannual install/reinstall netcdf4 by:
$ pip uninstall netcdf4
$ pip install netcdf4

Update SeisGo

If you installed SeisGo through github, run the following lines to update to the latest version (may not be a release on pip yet):

git pull
pip install .   #note there is a period "." sign here, indicating the current directory

If you installed SeisGo through pip, you can get the latest release (GitHub always has the most recent commits) though running these lines:

pip install seisgo --upgrade

Structure of the package

  1. seisgo: This directory contains the main modules.

  2. notebooks: This directory contains the jupyter notebooks that provide tutorials for all modules.

  3. data: Data for testing or running the tutorials is saved under this folder.

  4. figs: Here we put figures in tutorials and other places.

  5. scripts: This directory contains example scripts for data processing using seisgo. Users are welcome to modify from the provided example scripts to work on their own data.

Tutorials on key functionalities

  1. Download continuous waveforms for large-scale job (see item 3 for small jobs processing in memory). Example script using MPI is here: scripts/seisgo_download_MPI.py. The following lines show an example of the structure without MPI (so that you can easily test run it in Jupyter Notebook).
import os,glob
from seisgo.utils import split_datetimestr,extract_waveform,plot_trace
from seisgo import downloaders

rootpath = "data_test" # roothpath for the project
DATADIR  = os.path.join(rootpath,'Raw')          # where to store the downloaded data
down_list  = os.path.join(DATADIR,'station.txt') # CSV file for station location info

# download parameters
source='IRIS'
samp_freq = 10                      # targeted sampling rate at X samples per seconds
rmresp   = True
rmresp_out = 'DISP'

# targeted region/station information
lamin,lamax,lomin,lomax= 39,41,-88,-86           # regional box:
net_list  = ["TA"] #                              # network list
chan_list = ["BHZ"]
sta_list  = ["O45A","SFIN"]                       # station
start_date = "2012_01_01_0_0_0"                   # start date of download
end_date   = "2012_01_02_1_0_0"                   # end date of download
inc_hours  = 12                                   # length of data for each request (in hour)
maxseischan = 1                                   # the maximum number of seismic channels
ncomp      = maxseischan #len(chan_list)

downlist_kwargs = {"source":source, 'net_list':net_list, "sta_list":sta_list, "chan_list":chan_list, \
                   "starttime":start_date, "endtime":end_date, "maxseischan":maxseischan, "lamin":lamin, \
                   "lamax":lamax,"lomin":lomin, "lomax":lomax, "fname":down_list}

stalist=downloaders.get_sta_list(**downlist_kwargs) #
#this is a critical step for long duration downloading, as a demo here.
all_chunk = split_datetimestr(start_date,end_date,inc_hours)

#################DOWNLOAD SECTION#######################
for ick in range(len(all_chunk)-1):
   s1= all_chunk[ick];s2=all_chunk[ick+1]
   print('time segment:'+s1+' to '+s2)
   downloaders.download(source=source,rawdatadir=DATADIR,starttime=s1,endtime=s2,\
                      stationinfo=stalist,samp_freq=samp_freq)

print('downloading finished.')

#extrace waveforms
tr=extract_waveform(glob.glob(os.path.join(DATADIR,"*.h5"))[0],net_list[0],sta_list[0],comp=chan_list[0])
plot_trace([tr],size=(10,4),ylabels=['displacement'],title=[net_list[0]+'.'+sta_list[0]+'.'+chan_list[0]])

You should see the following image showing the waveform for TA.O45A. plot1

  1. Download earthquake catalog and waveforms with given window length relative to phase arrivals TBA.

  2. Ambient noise cross-correlations

  • Minimum lines version for processing small data sets in memory. Another example is in notebooks/seisgo_download_xcorr_demo.ipynb.
from seisgo import downloaders
from seisgo.noise import compute_fft,correlate

# download parameters
source='IRIS'                                 # client/data center. see https://docs.obspy.org/packages/obspy.clients.fdsn.html for a list
samp_freq = 10                                                  # targeted sampling rate at X samples per seconds

chan_list = ["BHZ","BHZ"]
net_list  = ["TA","TA"] #                                             # network list
sta_list  = ["O45A","SFIN"]                                               # station (using a station list is way either compared to specifying stations one by one)
start_date = "2012_01_01_0_0_0"                               # start date of download
end_date   = "2012_01_02_1_0_0"                               # end date of download

# Download
print('downloading ...')
trall,stainv_all=downloaders.download(source=source,starttime=start_date,endtime=end_date,\
                                  network=net_list,station=sta_list,channel=chan_list,samp_freq=samp_freq)

print('cross-correlation ...')
cc_len    = 1800                                                            # basic unit of data length for fft (sec)
cc_step      = 900                                                             # overlapping between each cc_len (sec)
maxlag         = 100                                                        # lags of cross-correlation to save (sec)

#get FFT
fftdata1=compute_fft(trall[0],cc_len,cc_step,stainv=stainv_all[0])
fftdata2=compute_fft(trall[1],cc_len,cc_step,stainv=stainv_all[1])

#do correlation
corrdata=correlate(fftdata1,fftdata2,maxlag,substack=True)

#plot correlation results
corrdata.plot(freqmin=0.1,freqmax=1,lag=100)

You should get the following figure: plot1

  • Run large-scale jobs through MPI. For processing of large datasets, the downloaded and xcorr data will be saved to disk. Example script here: scripts/seisgo_xcorr_MPI.py

Contribute

Any bugs and ideas are welcome. Please file an issue through GitHub.

References

  • Bell, S. W., D. W. Forsyth, & Y. Ruan (2015), Removing Noise from the Vertical Component Records of Ocean-Bottom Seismometers: Results from Year One of the Cascadia Initiative, Bull. Seismol. Soc. Am., 105(1), 300-313, doi:10.1785/0120140054.
  • Clements, T., & Denolle, M. A. (2020). SeisNoise.jl: Ambient Seismic Noise Cross Correlation on the CPU and GPU in Julia. Seismological Research Letters. https://doi.org/10.1785/0220200192
  • Janiszewski, H A, J B Gaherty, G A Abers, H Gao, Z C Eilon, Amphibious surface-wave phase-velocity measurements of the Cascadia subduction zone, Geophysical Journal International, Volume 217, Issue 3, June 2019, Pages 1929-1948, https://doi.org/10.1093/gji/ggz051
  • Jiang, C., & Denolle, M. A. (2020). NoisePy: A New High-Performance Python Tool for Ambient-Noise Seismology. Seismological Research Letters. https://doi.org/10.1785/0220190364
  • Tian, Y., & M. H. Ritzwoller (2017), Improving ambient noise cross-correlations in the noisy ocean bottom environment of the Juan de Fuca plate, Geophys. J. Int., 210(3), 1787-1805, doi:10.1093/gji/ggx281.
  • Yuan, C., Bryan, J., & Denolle, M. (2021). Numerical comparison of time-, frequency-, and wavelet-domain methods for coda wave interferometry. Geophysical Journal International, 828โ€“846. https://doi.org/10.1093/gji/ggab140
  • Yang, X., Bryan, J., Okubo, K., Jiang, C., Clements, T., & Denolle, M. A. (2022). Optimal stacking of noise cross-correlation functions. Geophysical Journal International, 232(3), 1600โ€“1618. https://doi.org/10.1093/gji/ggac410

seisgo's People

Contributors

izzy-z avatar xtyangpsp avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

seisgo's Issues

error when saving LARGE CorrData.time attribute

This is basically a HDF issue that only attributes < 64k can be saved. When saving CorrData as an auxiliary data, the time attribute, which is wrapped in parameter dictionary, might be too large to save in some cases. This may occur often when saving the merged data. No easy fix yet. A compromise is to reduce the length of the time attribute. The following lines will reproduce the error:

import pyasdf
import numpy as np
from obspy import UTCDateTime
data=np.random.rand(10000)
tt=np.float64(UTCDateTime('2021-01-01'))+1000*data
para={"time":tt}
with pyasdf.ASDFDataSet('test.h5',mpi=False) as ds:
    ds.add_auxiliary_data(data=data, data_type='test', path='random', parameters=para)

Here is the error:

RuntimeError: Unable to create attribute (object header message is too large)

optimize order of detrend and demean in slicing_trace

The current function slicing_trace() in utils.py does demean and detrend after finishing the slicing work. This might influence the trace_std. A more optimal order might be to do deman and detrend twice: one at the very beginning against the whole trace, another once before computing the trace std. We will need to test whether this change makes any meaningful difference in the sliced data and the correlations.

bugs/typo in noise.py

Line no 516: Indx = np.where(cc_comp==comp)[0]
It is not working as cc_comp is a list.
Line no 613: dstack[i,:] = stack.seisstack(cc_array,method=method,par=par)
method is a list here, but seisstack is expecting a string.
Moreover, In the function do_stacking, there is conflict between variable 'rotation' and function call to 'rotation'.

Note: This is the first time I am reporting any issue on this platform. So please accept my apology in advance if this is not the correct way of presenting.

loop through pairs in noise.shaping_corrdata

The current noise.shaping_corrdata only works on the first station pair, by assuming there is only one pair. This needs to be changed to loop through all pairs. Component option could also be changed accordingly to consider all data in the ccfile.

add rotate_corrdata

Add a function to rotate the components of a list of CorrData objects with all 9 component pairs. This capability is currently in the outdated do_stacking() function in noise.py. The stacking functionality is now replaced with merge_pair() function. The rotation functionality should be separated out from stacking.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.