Python Automation and Machine Learning for EM and ICs

An Online Book, First Edition by Dr. Yougui Liao (2019)

Python Installation

Many development environments are available for Python, for instance:
       i) PyDev with Eclipse
       ii) Emacs
       iii) Vim
       iv) TextMate
       v) Gedit
       vi) IDLE
       vii) PIDA (Linux, Vim-based)
       viii) Notepad++ (Windows)
       ix) Bluefish (Linux)
To write programs in Python, you can also use Anaconda. The Anaconda Python Distribution is probably the best starting point for data science because it comes bundled with almost everything you need: 100 of the most popular Python, R, and Scala packages for data science and several open-source development environments, e.g. JupyterLab/Jupyter Notebook and the Spyder IDE.

Figure 4849. Most popular Python IDE by region: panels (a) and (b).

Note that for beginners, IDLE and Anaconda are enough for routine programming. The basic steps for installing Python and its add-ons are described here.

To install Python, first go to www.python.org and then follow these steps:
       i) Go to "Downloads", locate "Windows" (if your PC runs Windows), and click "View the full list of downloads".
       ii) Click the version you want to install.

On the first screen of the installer, check the box that adds Python to PATH.

Immediately after installing Python, check the Python path to make sure the environment settings are correct. If they are not, correct the path as follows:
        a) Search for "Python" in the Windows Start menu, right-click the result, and click "Open file location".
        b) In the folder that opens, right-click "Python x.x" and click "Open file location" again.
        c) The folder that now opens contains "python.exe"; note that its location is different from the one in step a).
        d) Copy that location, for example "C:\Users\yyliao\AppData\Local\Programs\Python\Python39".
        e) Open "Run" through the Windows "Search" function and open the System Properties dialog.
        f) Go to Advanced > Environment Variables.
        g) Locate the two "Path" entries (one under user variables, one under system variables).
        h) If the Python location is not listed there, type or paste the path copied in step d). The current value can be checked beforehand with "echo %PATH%" in a command prompt.
Set PYTHONPATH on Windows to make Python look in additional directories for module and package imports:
     i) Go to:
        My Computer > Properties > Advanced System Settings > Environment Variables
     ii) Under system variables, edit the PYTHONPATH variable. At the end of the current value, add a semicolon followed by the directory you want to add, for example:
        C:\Python27;C:\foo
In this example, the foo directory is added to PYTHONPATH. Note that it is appended to the original value and does not replace it. In most cases, however, you should not modify PYTHONPATH at all: it is easy to get wrong, and a wrong value causes import problems that are hard to trace.
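To see what a PYTHONPATH value like the one above means to the interpreter, the following minimal sketch splits such a string into directories and reports whether each directory from the current environment is on sys.path. The helper name split_pythonpath is hypothetical and only used for illustration.

```python
import os
import sys


def split_pythonpath(value, sep=os.pathsep):
    """Split a PYTHONPATH-style string into its non-empty directory entries."""
    return [entry for entry in value.split(sep) if entry]


# Windows uses ";" as the separator, so it is passed explicitly here.
print(split_pythonpath(r"C:\Python27;C:\foo", sep=";"))

# Report whether each directory from the real PYTHONPATH is on sys.path.
for entry in split_pythonpath(os.environ.get("PYTHONPATH", "")):
    status = "on sys.path" if entry in sys.path else "not on sys.path"
    print(entry, "->", status)
```

If PYTHONPATH is not set, the loop prints nothing, which is the normal situation on a fresh installation.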

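Install pip (if it is not already included with your Python installation) by downloading get-pip.py and running it:
     python get-pip.py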

Steps to install NumPy, Matplotlib, and SciPy:
       i) Locate the "Scripts" folder on your PC after installing Python.
       ii) Create a text file, "local.txt", in that folder.
       iii) Write "cmd" in "local.txt" and save it.
       iv) Change the file extension from ".txt" to ".bat".
       v) Double-click "local.bat"; a command prompt (cmd.exe) opens in the "Scripts" folder.
       vi) Install the NumPy library:
              pip install numpy
       NumPy is a library for the Python programming language that adds support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
       vii) Install the Matplotlib library:
              pip install matplotlib
       Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits like Tkinter, wxPython, Qt, or GTK.
       viii) Install the SciPy library:
              pip install scipy
       SciPy is a free and open-source Python library for scientific and technical computing. It contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers, and other tasks common in science and engineering.
       ix) Install the HyperSpy library:
              pip install hyperspy
       HyperSpy is an open-source Python library that provides tools for the interactive analysis of multi-dimensional datasets that can be described as multi-dimensional arrays of a given signal (e.g. a 2D array of spectra, a.k.a. a spectrum image).
       x) Install the Similar Image Finder (Simimg) library:
              pip install simimg
       Simimg is a Python GUI for displaying pictures grouped according to similarity. Its main aim is to help identify groups of holiday snaps that resemble each other and to inspect those groups efficiently, so that you can easily keep only the best photos. It is not designed to identify pictures that are the same but modified (recompressed JPEGs, cropped images, adapted colours, etc.).
       Then install img2vec-pytorch:
              pip install img2vec-pytorch
       xi) Install opencv-python, which provides the cv2 module:
              pip install opencv-python
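After the installations above, it can be useful to confirm which of the packages pip actually registered. The following is a minimal sketch that uses the standard-library importlib.metadata module (Python 3.8 or later); the helper name installed_versions is hypothetical, and the list of names simply repeats the pip package names mentioned in the steps.

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# pip distribution names, which may differ from the import names (e.g. cv2).
wanted = ["numpy", "matplotlib", "scipy", "hyperspy", "opencv-python"]
for name, version in installed_versions(wanted).items():
    print(f"{name:15s} {version or 'not installed'}")
```

The check reflects the interpreter that runs the script, so it should be run with the same Python for which the packages were installed.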

Some installations cannot be completed with a simple "pip install ...", for example because of network restrictions. In such cases, the installation can sometimes be completed in a different way, for instance:
         i) "pip install beautifulsoup4" or "pip3.6 install beautifulsoup4" to install Beautiful Soup.
         ii) The following commands to install gensim:
                 easy_install -U gensim
                 pip install --upgrade gensim
         iii) "py -3 -m pip install ppt", which runs pip through the Python launcher for Python 3.
Table 4849a. Conda and pip commands.

Action | Conda | pip
List all installed packages | conda list | pip freeze
Install the latest package version | conda install package | pip install package
Install a specific package version | conda install package=1.0.0 | pip install package==1.0.0
Update a package | conda update package | pip install --upgrade package
Uninstall a package | conda remove package | pip uninstall package
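The output of "pip freeze" is a plain list of "name==version" lines, so it is easy to post-process. The following is a minimal sketch of such a parser; parse_freeze is a hypothetical helper, and the sample text and its version numbers are made up for illustration only.

```python
def parse_freeze(text):
    """Turn `pip freeze` output into {package: version} for pinned lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries that are not "name==version" pins.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


# Illustrative input only; real output comes from running "pip freeze".
sample = "numpy==1.26.4\nSciPy==1.11.0\n# a comment\n"
print(parse_freeze(sample))
```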

Table 4849b. Some other installations.

Installations | Functions and notes
Python on Mac | Use "python3 -m pip install xxx" instead of the "pip install xxx" used on Windows
logic | pip install logic-py, and pip install logic
Check Python version used | python.exe -V
pip install email | ModuleNotFoundError: No module named 'email_pre'
pip freeze | Show the installed libraries
pip install --upgrade pip | Update pip
pip install petl
pip install python-constraint
pip install bokeh
Packages for Google Search | pip install googlesearch-python; pip install beautifulsoup4; pip install google
pip install -q tensorflow | pip install -q tensorflow==2.0.0-beta1
pip install gmaps | Google Maps
pip install geopandas
pip install geopy | Get the longitude and latitude of a city
pip install geocoder
pip install cdata | Connect to live data directly from Python
pip install mediapipe
pip install pywinauto
pip install turicreate
pip install pyinstaller
pip install spacy
pip install gensim
pip install PySimpleGUI27
pip install auto-py-to-exe
pip install keyboard
pip install seleniumbase
pip install psutil
pip install google.colab | Has errors
pip install google-colab | Has errors
pip install periodictable
pip install pyperclip3
pip install pytesseract | Highlight a specific word in an input image
pip install opencv-contrib-python | Highlight a specific word in an input image
pip install xerox
pip install pdf
pip install Py2exe
pip install PyGetWindow
pip install sets
pip install watchdog
pip install hasher
pip install matplotlib-scalebar
pip install python-pptx
pip install clipboard
pip install pyperclip
pip install epub-conversion
pip install xml_cleaner
pip3 install rake-nltk
pip install scikit-learn-extra
pip install missingno
pip install plotly
pip3 install keybert
pip3 install wordcloud | For the Word Cloud application
pip3 install matplotlib | For the Word Cloud application
pip install fpdf
pip install aspose.slides | For merging pptx slides, but not useful in practice because it adds advertisements that cover the slides
pip install scrapy
pip install textract-plus | import textractplus as tp
pip install scikit-learn | The package is installed as scikit-learn but imported as sklearn (import sklearn)
pip install tensorflow | Command to verify your TensorFlow version from a terminal: python -c "import tensorflow as tf; print(tf.__version__)"
pip install tensorflow-gpu | To use GPUs, the CUDA Toolkit and the NVIDIA cuDNN library need to be installed first; then TensorFlow can be installed with GPU support
GraphLab Create | GraphLab Create is not free. pip install --upgrade --no-cache-dir https://get.graphlab.com/GraphLab-Create/2.1/your registered email address here/your product key here/GraphLab-Create-License.tar.gz (page4073)
pip install torchvision | Install torchvision
pip install pytorch-pretrained-bert pytorch-nlp | Install the pretrained version of BERT available in the pytorch-nlp package
import extract_msg | For Outlook: pip install extract-msg, pip install imapclient
pip install pyresume
pip install tabpy | Install the Tableau Python module. Alternatively, download the TabPy repository from https://github.com/tableau/TabPy, or clone it with git clone https://github.com/tableau/TabPy

Installation of Tesseract on Windows:
        i) The Tesseract installer is available at: https://github.com/UB-Mannheim/tesseract/wiki.
        ii) Typical Tesseract path: C:\Users\USER\AppData\Local\Tesseract-OCR (it may differ on your system).
        iii) pip install pytesseract
        iv) Set the Tesseract path in the script before calling image_to_string: pytesseract.pytesseract.tesseract_cmd = r"C:\Users\xyz\AppData\Local\Tesseract-OCR\tesseract.exe" (e.g. page4352).
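Because the Tesseract location may differ between machines, a script can look for the executable instead of hard-coding one path. The sketch below shows one possible approach; find_tesseract is a hypothetical helper, the candidate path is only the typical location mentioned above, and the pytesseract lines are left as comments so that the sketch runs without that package.

```python
import os
import shutil


def find_tesseract(candidates=()):
    """Return the first existing candidate path, else whatever is on PATH."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    # shutil.which returns None when no tesseract executable is on PATH.
    return shutil.which("tesseract")


tesseract_exe = find_tesseract(
    [r"C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe"]
)
print(tesseract_exe or "tesseract not found")

# With pytesseract installed, the result would be used as in step iv):
# import pytesseract
# pytesseract.pytesseract.tesseract_cmd = tesseract_exe
```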

Installation of WebDriver:
        i) Go to https://www.selenium.dev/downloads/
        ii) Go to Browsers
        iii) Go to Chrome
        iv) Click "documentation" at "ChromeDriver is supported by the Chromium project, please refer to their documentation for any compatibility information"
        v) Click "Latest beta release: ChromeDriver 97.0.4692.36"
        vi) Download and unzip it to obtain chromedriver.exe.

For web automation, it is helpful to install the SelectorsHub - XPath Plugin:
             It can be downloaded at: chrome.google.com/webstore/.

        Possible warning when running a script with chromedriver.exe: "DeprecationWarning: executable_path has been deprecated, please pass in a Service object".
        To resolve it, stop passing executable_path and pass an instance of the Service() class instead: s = Service('C:/Users/…/chromedriver.exe') and then driver = webdriver.Chrome(service=s).
        The code then becomes:
             from selenium import webdriver
             from selenium.webdriver.chrome.service import Service  # added
             ser = Service(r"C:\...\chromedriver.exe")  # added: path to chromedriver.exe
             op = webdriver.ChromeOptions()  # added, optional
             s = webdriver.Chrome(service=ser, options=op)  # added
             # If no options are needed, use instead: s = webdriver.Chrome(service=ser)

Steps to fix the "ModuleNotFoundError: No module named '...'" error in Python:
         Step i) Install the missing package with pip, e.g. "pip install discord.py". If the pip command itself is not found, install pip first.
         Step ii) Use the pip that belongs to the Python version you run: e.g. pip3 install ... (for Python 3.x), or pip3.9 install ... or py -3.9 -m pip install ... (for Python 3.9).
         Step iii) Add the Python "Scripts" folder to PATH with "setx":
            iii.a) Check the environment variable first:
               echo %PATH%
            iii.b) Append the folder:
               setx PATH "%PATH%;C:\Users\yyliao\AppData\Local\Programs\Python\Python39\Scripts"
         Step iv) Alternatively, add the path through the system dialog:
            iv.a) Open "Run" through the Windows "Search" function.
            iv.b) Go to Advanced > Environment Variables.
            iv.c) Locate the two "Path" entries.
            iv.d) If Python is not listed there, type or paste its path (check the current value with "echo %PATH%", see iii.a)).
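Whether a folder such as the Python "Scripts" directory is already on PATH can also be checked from Python. The following is a minimal sketch; dir_on_path is a hypothetical helper, it assumes the Windows ";" separator by default, and it compares paths case-insensitively because Windows paths are not case-sensitive.

```python
import ntpath
import os


def dir_on_path(directory, path_value, sep=";"):
    """Check whether a Windows directory is listed in a PATH-style string."""
    def normalise(p):
        # Lower-case the path and drop any trailing backslash before comparing.
        return ntpath.normcase(ntpath.normpath(p.strip()))

    target = normalise(directory)
    return any(normalise(entry) == target
               for entry in path_value.split(sep) if entry.strip())


scripts = r"C:\Users\yyliao\AppData\Local\Programs\Python\Python39\Scripts"
print(dir_on_path(scripts, os.environ.get("PATH", "")))
```

A result of False suggests that the folder still has to be added with one of the methods above, and that a new command prompt must be opened afterwards for the change to take effect.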

Troubleshooting of watchdog:
         i) Error message after running a script that uses watchdog: ModuleNotFoundError: No module named 'watchdog.observers'; 'watchdog' is not a package.
         ii) Cause A: the script that monitors the folder is itself named watchdog.py, so "import watchdog" loads the script instead of the library. Rename the script.
         iii) Cause B: another file named watchdog.py sits in the same folder as the monitoring script. Rename or remove it.
         iv) Cause C: the Anaconda path (C:\Anaconda3\Lib\site-packages) contains no watchdog module. Copy the two items (watchdog and watchdog-0.8.3-py3.6egg) from C:\Python36\Lib\site-packages to C:\Anaconda3\Lib\site-packages, or install watchdog into Anaconda with its own pip.
Important Python Libraries for Electrical and Electronics Engineers:
ElectricPy, SKiDL, PySpice, Pint, NumPy, Matplotlib, Jupyter, SciPy, SymPy, Numdifftools
Procedure to install electrical simulation programs:
         i) Install Python
         ii) Install PySpice
         iii) Install NgSpice:
                  pyspice-post-installation --install-ngspice-dll
         iv) Check whether the installation works:
                  pyspice-post-installation --check-install
If it works, it will show: "PySpice should work as expected"
Important Python Libraries for Data Science: the following libraries are useful for data scientists.
Pandas, NumPy, Matplotlib, Plotly, Pydot, Gensim, PyOD, StatsModels, SciPy, BeautifulSoup, XGBoost, Seaborn, Bokeh, Scikit-Learn, TensorFlow, Keras, PyTorch, Theano, NLTK, Scrapy
 
Important Python Libraries for Image Processing:
scikit-image, OpenCV, Mahotas, SimpleITK, SciPy, Pillow, Matplotlib, NumPy
To make sure the installations were successful, start Python and run:
         import cv2
         import matplotlib
         import numpy
Important Python Libraries for the finance industry:
Pandas, NumPy, SciPy, Pyfolio, Statsmodels, Pynance, Zipline, Quandl
Python packages to support Excel applications:
Install setuptools:
     python -m pip install -U pip setuptools, or python3 -m pip install -U pip setuptools (for Windows)
     pip install -U pip setuptools, or pip3 install -U pip setuptools (for Linux/OS X)
pandas, openpyxl, xlrd, xlutils, pyexcel

Python packages to support webpage/HTML applications:

wkhtmltopdf, imgkit, python-pip, selenium, chromedriver, chromedriver_autoinstaller, chrome-bookmarks

Python packages to support Machine Learning applications

sklearn.impute            
             

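Check if a module/library is installed (note that "'geopandas' in sys.modules" only tells whether the module has already been imported in the current session):
        import importlib.util
        importlib.util.find_spec('geopandas') is not None
        Output: False => Means "Not Installed".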