Tutorial of Python for Seismologists

Motivation

Motivation of writing this article is telling seismologists what python can do in seismological research. I strongly recommend you to use Python as your primary language. As life is short, we should use python.

Basic Python

In [88]:
# basic variable types of python
int_num, float_num, string, dictionary, list_type = 1, 2.0, "string", {"key":"value"}, [1, 2.0, "string", {"key":"value"}]
logical_var = True

# just print can output any data type 
print(float(int_num), str(float_num), dictionary, type(list_type))
1.0 2.0 {'key': 'value'} <class 'list'>
In [89]:
# only the list and dictionary are adjustable
dictionary.update({3:4});list_type.append("appended");
print(dictionary, list_type)
{'key': 'value', 3: 4} [1, 2.0, 'string', {'key': 'value'}, 'appended']
In [90]:
# functions in python
def summation_func(frt, second=100, third=None):
    """Demo function
    
    Parameter
    =========
    frt : str
        fisr variable is location variable
    second : int
        second variable has it's own default value which is 100 (keywords  variable)
    third : int
        second variable also has it's own default value (keywords variable)
       
    """
    if frt is 1:
        print("First variable is {}".format(1))
    else:
        print("First variable is not {}".format(1))
        
    if frt in [1,2,3]:
        print("first variable is included in {}".format([1,2,3]))
    
    summation = frt + second + third
    return summation
In [71]:
summation = summation_func(frt=2, second=2, third=3);print(summation)
First variable is not 1
first variable is included in [1, 2, 3]
7
In [17]:
# Python is an object-oriented language
class student(object):
    """a kind of class to represent students
    """
    def __init__(self, name="Dong Dong", score=88):
        """
        Initialize a student with it's name and score
        
        Parameter
        =========
        name : str
            name of this student
        score : float
            score of it
        """
        self.name = name
        self.score = score
        
    def change_score(self, newscore):
        """
        Change score of this student
        
        Parameter
        =========
        newscore : float
            new score to be changed
        """
        self.score = newscore
        print("score of this student has been changed to {}".format(newscore))
In [19]:
# a kind of class can have multiple instances
dongdong = student(name="DongDong", score=90)
huazai = student(name="huazai", score=92)
print(dongdong.score, huazai.score)
90 92
In [20]:
# a class can bundle multiple methods
dongdong.change_score(newscore=95);print(dongdong.score)
score of this student has been changed to 95
95
In [25]:
# numerous modules to import
import numpy as np
numpy_data = np.array([1,2,3]);print(type(numpy_data))
<class 'numpy.ndarray'>
In [26]:
# this class bundles many powerful methods
summation = numpy_data.sum();print(summation)
6
In [31]:
# numpy also defined another powerful class `matrix`
matrix_demo = np.matrix([[1,2],[2,3]]);print(type(matrix_demo))
print(matrix_demo)
<class 'numpy.matrixlib.defmatrix.matrix'>
[[1 2]
 [2 3]]
In [34]:
# all matrix calculation should be done with this class
matrix_demo2 = np.matrix([[1,2],[2,3]]);
matrix_demo * matrix_demo2

# explain multiple methods bunddled with this class
Out[34]:
matrix([[ 5,  8],
        [ 8, 13]])

Bridge between Seismology and Scientific Python Ecosystem--ObsPy

In [91]:
# import seismic data and do some processing 
# This demo comes from tutorial of ObsPy

from obspy import UTCDateTime

t1 = UTCDateTime("2010-09-3T16:30:00.000")
t2 = UTCDateTime("2010-09-3T17:00:00.000")
print(type(t1));print(t1-100);print(t1.month, t1.day, t1.hour)
<class 'obspy.core.utcdatetime.UTCDateTime'>
2010-09-03T16:28:20.000000Z
9 3 16
In [93]:
from obspy.clients.fdsn import Client
fdsn_client = Client('IRIS')
# Fetch waveform from IRIS FDSN web service into a ObsPy stream object
# and automatically attach correct response
st = fdsn_client.get_waveforms(network='NZ', station='BFZ', location='10',
                               channel='HH?', starttime=t1, endtime=t2,
                               attach_response=True)
st.plot()
In [47]:
# Many methods are bundled with stream class such as :
st.trim(t1+300, t1+2500);st.plot();
In [101]:
# define a filter band to prevent amplifying noise during the deconvolution
pre_filt = (0.005, 0.006, 30.0, 35.0)
st.remove_response(output='DISP', pre_filt=pre_filt);st.plot();st.decimate(factor=10);st.decimate(factor=10);
In [102]:
# stream is a combination of traces
tr = st[0];tr.plot()
In [103]:
from termcolor import colored
print(tr);print(colored(tr.stats, "red"));print(tr.data);print(colored(type(tr.data), "blue"));print(colored(tr.stats.station, "yellow"))
NZ.BFZ.10.HHE | 2010-09-03T16:30:00.008393Z - 2010-09-03T16:59:59.008393Z | 1.0 Hz, 1800 samples
               network: NZ
               station: BFZ
              location: 10
               channel: HHE
             starttime: 2010-09-03T16:30:00.008393Z
               endtime: 2010-09-03T16:59:59.008393Z
         sampling_rate: 1.0
                 delta: 1.0
                  npts: 1800
                 calib: 1.0
_fdsnws_dataselect_url: http://service.iris.edu/fdsnws/dataselect/1/query
               _format: MSEED
                 mseed: AttribDict({'dataquality': 'M', 'number_of_records': 928, 'encoding': 'STEIM2', 'byteorder': '>', 'record_length': 512, 'filesize': 1394176})
            processing: ["ObsPy 1.1.0: remove_response(fig=None::inventory=None::output='DISP'::plot=False::pre_filt=(0.005, 0.006, 30.0, 35.0)::taper=True::taper_fraction=0.05::water_level=60::zero_mean=True)", "ObsPy 1.1.0: remove_response(fig=None::inventory=None::output='DISP'::plot=False::pre_filt=(0.005, 0.006, 30.0, 35.0)::taper=True::taper_fraction=0.05::water_level=60::zero_mean=True)", "ObsPy 1.1.0: filter(options={'freq': 5.0, 'maxorder': 12}::type='lowpass_cheby_2')", 'ObsPy 1.1.0: decimate(factor=10::no_filter=False::strict_length=False)', "ObsPy 1.1.0: filter(options={'freq': 0.5, 'maxorder': 12}::type='lowpass_cheby_2')", 'ObsPy 1.1.0: decimate(factor=10::no_filter=False::strict_length=False)']
              response: Channel Response
	From M/S (Velocity in Meters Per Second) to COUNT (Digital Counts)
	Overall Sensitivity: 2.51658e+09 defined at 1.000 Hz
	3 stages:
		Stage 1: PolesZerosResponseStage from M/S to V, gain: 1500
		Stage 2: CoefficientsTypeResponseStage from V to COUNT, gain: 1.67772e+06
		Stage 3: FIRResponseStage from COUNT to COUNT, gain: 1
[ -3.68761247e-24  -1.69445003e-17  -4.20143551e-16 ...,  -4.01860648e-15
  -2.88365843e-15  -1.60807516e-15]
<class 'numpy.ndarray'>
BFZ

Scientific Python Ecosystem for Seismologists

In [115]:
# Do some data process with scipy

# now, let's do something special with scientific python libraray\
# We simplely use FFT package from scipy to analysis main energe of this signal
# Demo comes from reference mannual of scipy

from scipy.fftpack import fft
data = tr.data; sampling_rate = tr.stats.sampling_rate;
spectrum_data = fft(data)
deltaT = 1.0 / sampling_rate

# add a filter for this signal
N = len(data)   # data point number
from scipy.signal import gaussian
w = gaussian(N, std=N * 0.5)
ywf = fft(spectrum_data * w)
xf = np.linspace(0.0, 1.0/(2.0 * deltaT), N/2)
E:\anaconda\lib\site-packages\ipykernel_launcher.py:14: DeprecationWarning: object of type <class 'float'> cannot be safely interpreted as an integer.
  
In [116]:
# Visualization part
import matplotlib.pyplot as plt
plt.semilogy(xf[1:N//2], 2.0/N * np.abs(spectrum_data[1:N//2]), '-b')
plt.semilogy(xf[1:N//2], 2.0/N * np.abs(ywf[1:N//2]), '-r')
plt.legend(['FFT (Raw waveform)', 'FFT w. window'])
plt.grid()
plt.show()
In [122]:
# Demo of powerful and smart visualization part
# Based on simple easy setting, we can obtain some beautiful figures to publish or report 
# Demo comes from : https://www.safaribooksonline.com/library/view/python-data-science/9781491912126/ch04.html

from sklearn.datasets import load_iris
import seaborn

iris = load_iris()
features = iris.data.T

plt.scatter(features[0], features[1], alpha=0.2,
            s=100*features[3], c=iris.target, cmap='viridis')
plt.xlabel(iris.feature_names[0])
plt.ylabel(iris.feature_names[1])
plt.show()