This project originated in a "Digital Image Processing" course, where the basic program framework was built. Curve data point extraction was added later, together with support for more file formats and a number of common utility functions.

sunieee/DigitalImage

Digital Image Processing Program Framework

Function list:

  • Basic functions
    • Basic image processing program framework
    • Supported image file formats: *.png, *.jpg, *.bmp, *.raw, *.data
    • Image Fourier transform and filtering
    • Fourier descriptor representation of images
    • Edge detection
    • Color high-pass/low-pass filtering
  • Extract curve data points
    • Box cropping
    • Separation by color
    • Order selection
    • Fit or interpolate
  • UI design improvements
    • Extended file format support: PDF files can be opened and are converted to images automatically in the background
    • Optimized file cache that resolves possible conflicts; files are cached in the Roaming/digitalImage directory
    • The basic functions are compatible with the PDF format (image reconstruction for the Fourier transform and Fourier descriptors is not yet supported)
    • Improved interface design, with buttons for switching tabs, closing tabs, and moving images between input and output
  • Extensions
    • Image to PDF function
    • Image/PDF to GIF function
    • Remove watermark function
    • Composite large image function
    • Horizontal split function

Program description

Directory Structure

  • main.py: the main entry point of the program; run python main.py to start the PyQt5 interface
  • main.ui: the interface design, created with Qt Designer
  • Ui_main.py: the interface code generated by compiling the .ui file; it is called by the main program
  • util.py: utility functions, including histogram, Fourier transform, and various edge detection algorithms
  • LineExtractor: classes and functions for extracting curve data points
    • extractor.py: selects points sequentially and performs fitting or interpolation
    • seperator.py: separates lines by color
    • tailor.py: box cropping
  • dist: contains the packaged libraries and dependencies as well as the executable (not uploaded because it is too large; still under development)
    • To run the UI program directly, launch dist/main/main.exe
    • Packaging command: pyinstaller -D -i test.ico main.py --noconfirm
  • doc: related documents

Install the dependencies before running python main.py: pip install -r requirements.txt

Program interface

After the program starts, the window shown below appears. It contains an input pane and an output pane, each with a fixed size of 700×700. The window itself can be resized, but the pane size stays fixed.

image-20220710222933020

The menu bar contains the main menus listed below. Each menu offers a set of specific functions that correspond one by one to the requirements of the tasks described in the following sections, and some functions have keyboard shortcuts.

  • Files: open, close, and export images (and PDFs); an opened file is displayed in the input pane
  • Basic operation: takes the active image in the input pane as input, performs basic image operations, and displays the result in the output pane
  • FFT: takes the active image in the input pane as input, converts it to grayscale automatically, performs FFT-related operations, and displays the result in the output pane
    • The "Fourier Transform" function outputs 3 images: the FFT, the enhanced spectrum, and the inverse transform
  • Edge detection: takes the active image in the input pane as input, converts it to grayscale automatically, performs edge detection, and displays the result in the output pane
  • Extraction: the individual steps of curve data point extraction; the combined "whole process of extracting curve data points" is not yet supported
  • Practical tool: mainly operates on the PDF in the input pane; if the current input is not a PDF, the tool is applied to all input images and outputs a PDF

image-20220710223520646

image-20220710230459430

When running the program:

  • The console shows the program's runtime output. If an exception occurs, the program exits; see the console output for the reason.
  • An opened image is scaled for display: the larger of its width and height is set to 700 and the other dimension is scaled proportionally. The actual image size is unchanged and is shown as numbers in the lower right corner, together with the image name. For a PDF, the page number is shown as well.
  • The input and output panes each have multiple tabs, and you can switch between them. To close the active tab, click Menu > File > Close, or click the × button at the bottom of the window.
  • The content currently displayed in each pane is the active tab.
  • Use the < and > buttons at the bottom of the pane to switch the active tab. For a PDF, these buttons switch pages within the PDF instead of switching tabs.
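
The display scaling described above can be sketched as follows. This is only an illustration of the rule "larger side becomes 700"; the function name fit_to_pane and the rounding are assumptions, not taken from main.py.

```python
def fit_to_pane(width, height, pane=700):
    """Return the display size so that the larger side equals `pane`.

    Hypothetical helper: the real program may round differently.
    """
    scale = pane / max(width, height)
    return round(width * scale), round(height * scale)


print(fit_to_pane(1400, 700))  # (700, 350)
```

Only the displayed size changes; the underlying image data keeps its original dimensions.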

Basic functions

Image Processing Program Framework

  • Design the framework of a digital image processing program based on a VC multiple document interface (MDI)
  • Implement reading and display of BMP image files
  • Optionally implement reading and display of JPG and RAW files, as well as conversion to and from BMP
  • Implement the basic image operations: addition, negation, geometric transformation
  • Implement histogram equalization of the image

File reading and writing

Several file formats are handled: four image formats and the PDF format can be opened and closed. This enables conversion between the image formats, as well as conversion from PDF to image.

image-20220710230757818

Basic operations on images

image-20220630230720121

Addition sums all open input images (or the current PDF). The result takes the maximum width and the maximum height over all input images as its size.

Note: Do not add images with different numbers of channels; this crashes the program. Convert them to grayscale beforehand if necessary.
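
A minimal sketch of this kind of addition with NumPy is shown below. The function name add_images, the top-left alignment, and the saturation at 255 are assumptions made for illustration; util.py may handle these details differently.

```python
import numpy as np


def add_images(images):
    """Add images of possibly different sizes on a canvas of the maximum
    height and width. Images are aligned at the top-left corner (assumption)
    and the sum is clipped to 255 (assumption)."""
    h = max(img.shape[0] for img in images)
    w = max(img.shape[1] for img in images)
    # Wider dtype so that intermediate sums do not wrap around.
    canvas = np.zeros((h, w) + images[0].shape[2:], dtype=np.uint16)
    for img in images:
        canvas[:img.shape[0], :img.shape[1]] += img
    return np.clip(canvas, 0, 255).astype(np.uint8)


a = np.full((2, 3), 200, dtype=np.uint8)
b = np.full((4, 2), 100, dtype=np.uint8)
print(add_images([a, b]).shape)  # (4, 3)
```

Mixing a grayscale image with a three-channel image raises a shape error in this sketch as well, which is consistent with the note above.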

image-20220630230834085

After scaling, the displayed rendering looks the same, but the actual image size has changed; see the size shown in the upper right corner.

Histogram equalization

Colors are visibly more vibrant:

image-20220628191431704
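
For reference, histogram equalization of a single 8-bit channel can be sketched as below. This is the textbook CDF-based method, given here as an assumption; the project may equalize color images per channel or through a library call.

```python
import numpy as np


def equalize(gray):
    """Histogram-equalize one 8-bit channel using its cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]                 # CDF value of the darkest level present
    denom = max(cdf[-1] - cdf_min, 1)         # avoid division by zero for flat images
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255), 0, 255).astype(np.uint8)
    return lut[gray]                          # apply the lookup table per pixel


gray = np.array([[50, 50], [100, 150]], dtype=np.uint8)
print(equalize(gray))
```

The lookup table stretches the occupied gray levels over the full 0..255 range, which is why the result looks more vivid.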

Image Fourier transform and filtering

  • Implement the FFT of an image and display the result
  • Implement the inverse FFT

Observe the spectrum of a typical image after the FFT

  • First construct a black-and-white binary test image, for example a 4×4 white square in the center of a 128×128 black background. Then perform the following tests in sequence.

    • DFT

      image-20220627215141715
    • Translation and scaling

      image-20220627215247881
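
The test image and its spectrum can be reproduced with a few lines of NumPy. This is a sketch of the experiment, not the project's own code; the shift of (10, 20) pixels is an arbitrary example value.

```python
import numpy as np

# 4x4 white square in the centre of a 128x128 black background.
img = np.zeros((128, 128))
img[62:66, 62:66] = 1.0

# Centred magnitude spectrum of the test image.
magnitude = np.abs(np.fft.fftshift(np.fft.fft2(img)))

# Translating the square changes only the phase, not the magnitude.
shifted = np.roll(img, shift=(10, 20), axis=(0, 1))
mag_shifted = np.abs(np.fft.fftshift(np.fft.fft2(shifted)))

print(magnitude[64, 64])                   # DC term = sum of all pixels
print(np.allclose(magnitude, mag_shifted))
```

The DC term sits at index (64, 64) after fftshift and equals the number of white pixels, and the magnitude spectrum is unaffected by the translation.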

FFT

Menu bar > FFT > Fourier transform outputs 3 images during execution:

  • The spectrum in the complex domain (the forced conversion to uint8 distorts the image)
  • The 2-D DFT magnitude after dynamic range compression (the highest value is mapped to brightness 255)
  • The result of the inverse FFT

image-20220630231153611
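
One common way to obtain the second and third images is sketched below. The logarithmic compression is an assumption (the source only states that the highest value is mapped to 255), and the function names are hypothetical.

```python
import numpy as np


def compress_spectrum(gray):
    """Return the centred FFT magnitude, log-compressed and scaled to 0..255."""
    magnitude = np.abs(np.fft.fftshift(np.fft.fft2(gray)))
    log_mag = np.log1p(magnitude)             # assumed compression: log(1 + |F|)
    peak = log_mag.max()
    if peak == 0:
        return np.zeros(gray.shape, dtype=np.uint8)
    return np.round(log_mag / peak * 255).astype(np.uint8)


def inverse_fft(gray):
    """Round trip: forward FFT followed by the inverse FFT."""
    return np.fft.ifft2(np.fft.fft2(gray)).real


img = np.zeros((128, 128))
img[62:66, 62:66] = 255
print(compress_spectrum(img).max())           # 255
```

Without the compression step, the DC term dominates and almost everything else is lost when the spectrum is cast to uint8.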

High-pass/low-pass filtering

The filter radius is customizable and is entered interactively. The high-pass and low-pass results are as follows:

image-20220630232002932
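
A radius-based filter in the frequency domain can be sketched as follows. An ideal (hard cut-off) circular mask is assumed here; the project may use a different filter shape, and ideal_filter is a hypothetical name.

```python
import numpy as np


def ideal_filter(gray, radius, mode="low"):
    """Keep frequencies inside (low-pass) or outside (high-pass) a circle
    of the given radius around the centre of the shifted spectrum."""
    rows, cols = gray.shape
    y, x = np.ogrid[:rows, :cols]
    dist = np.hypot(y - rows // 2, x - cols // 2)
    mask = dist <= radius
    if mode == "high":
        mask = ~mask
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return filtered.real


img = np.random.default_rng(0).random((32, 32))
low = ideal_filter(img, 5, "low")
high = ideal_filter(img, 5, "high")
print(np.allclose(low + high, img))
```

Because the two masks are complementary, the low-pass and high-pass results add up to the original image.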

Representation of Fourier descriptors

The boundary in the XY plane shown in Figure 1 is represented by Fourier descriptors and reconstructed with different numbers of terms.

image-20220627215629337

A Fourier descriptor is an image feature that describes the shape of a contour.

The basic idea is as follows. Treat the outline of the object as a closed curve and let a point p(l) move along it, with complex coordinate x(l) + jy(l). Because the curve is closed, this is a periodic function whose period is the perimeter of the curve, so it can be represented by a Fourier series. The coefficients z(k) of that series are directly related to the shape of the closed boundary and are defined as the Fourier descriptors. When enough coefficients z(k) are kept, the descriptors capture the full shape information and the shape of the object can be restored. Fourier descriptors are simple and efficient, and they are one of the important methods for recognizing object shapes.

Put simply, a Fourier descriptor represents a contour as a vector of numbers, which makes it easier to tell different contours apart and thus to recognize objects.

As shown in the figure above, a small number of Fourier descriptors is enough to capture the general shape of the boundary. This property is useful because these coefficients carry the shape information.

The whole process is as follows:

  1. Edge detection: extract the edges with an edge detection algorithm and apply a closing operation to make the edges clearer and remove small black spots
  2. Select the largest of all contours as the target contour and draw it
  3. Compute the Fourier descriptors of the contour and output the first 32 descriptors in the pane
  4. Choose the number of descriptor terms to use for reconstruction

image-20220630232156686
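
Steps 3 and 4 can be sketched with NumPy as below. The contour is assumed to be an (N, 2) array of boundary points, and keeping the lowest-frequency coefficients is one common truncation rule; both function names are hypothetical.

```python
import numpy as np


def fourier_descriptors(contour):
    """contour: (N, 2) array of x, y boundary points -> complex coefficients z(k)."""
    z = contour[:, 0] + 1j * contour[:, 1]
    return np.fft.fft(z)


def reconstruct(descriptors, m):
    """Rebuild the contour from the m lowest-frequency descriptors."""
    n = len(descriptors)
    freqs = np.fft.fftfreq(n, d=1.0 / n)      # integer frequencies 0, 1, ..., -1
    keep = np.argsort(np.abs(freqs), kind="stable")[:m]
    truncated = np.zeros_like(descriptors)
    truncated[keep] = descriptors[keep]
    z = np.fft.ifft(truncated)
    return np.column_stack([z.real, z.imag])


t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.column_stack([3 + np.cos(t), 4 + np.sin(t)])
fd = fourier_descriptors(circle)
print(np.allclose(reconstruct(fd, 2), circle))
```

A circle needs only two terms (the centroid and one rotating component), while more complex boundaries need more terms before fine details appear.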

Edge detection

Edge detection

  • Implement image edge extraction based on typical differential operators (at least Roberts, Sobel, Prewitt, and Laplacian); the program reads an image file and outputs the edge detection result
  • Analyze and compare the characteristics of the different operators

image-20220630233332598

image-20220630233039433

image-20220630233227582

| Operator | Advantages and disadvantages |
| --- | --- |
| Roberts | Works well on images with steep edges and low noise, but the extracted edges are relatively thick, so edge localization is not very accurate |
| Sobel | Works well on images with gradual grayscale changes and more noise, and localizes edges fairly accurately |
| Scharr | Differs from the Sobel operator in the smoothing part: the smoothing kernel is 1/16 [3, 10, 3] instead of 1/4 [1, 2, 1], so the central element carries more weight. This assumes that the image is a strongly random signal whose neighborhood correlation is small |
| Prewitt | Works well on images with gradual grayscale changes and more noise |
| Laplacian | Locates step edge points accurately, but is very sensitive to noise and loses part of the edge direction information, which results in some discontinuous detected edges |
| LoG | The LoG operator often produces double-pixel edges and is sensitive to noise, so it is rarely used to detect edges directly; instead it is used to judge whether an edge pixel lies in the bright or the dark area of the image |
| Canny | Not easily disturbed by noise and able to detect true weak edges; the most effective of the edge detection methods listed here. It uses two different thresholds to detect strong and weak edges separately, and a weak edge is included in the output only when it is connected to a strong edge. The method is therefore less likely to be "filled" by noise and more likely to detect real weak edges |
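
As an illustration of a differential operator from the table, the Sobel gradient magnitude can be sketched in plain NumPy. This is a didactic version with an edge-replicated border (an assumption); the project itself may rely on library routines in util.py.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def filter3x3(gray, kernel):
    """Correlate a 2-D image with a 3x3 kernel (edge-replicated border)."""
    padded = np.pad(gray.astype(float), 1, mode="edge")
    out = np.zeros(gray.shape, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + gray.shape[0], dx:dx + gray.shape[1]]
    return out


def sobel_magnitude(gray):
    """Gradient magnitude from the horizontal and vertical Sobel responses."""
    return np.hypot(filter3x3(gray, SOBEL_X), filter3x3(gray, SOBEL_Y))


step = np.zeros((8, 8), dtype=np.uint8)
step[:, 4:] = 255                             # vertical step edge
print(sobel_magnitude(step)[0])
```

The response is nonzero in the two columns next to the step, which illustrates why gradient operators produce edges more than one pixel wide. Swapping the kernels for the Prewitt or Scharr weights gives the other operators of the same family.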

Extract Curve Data Points

Example of the program running:

Animation

Requirements

Input: .jpg/.png images of thermogravimetric test curves of materials, collected from the literature

Output: the data points corresponding to each curve

The images involved fall mainly into the following three categories:

  1. Simple lines

image-20220611175527031

  2. Lines with interference (i.e. the blue lines)

image-20220611175541345

  3. Complex lines (multiple target curves)

image-20220611175603750

Problem analysis

image-20220611181144776

In the examples above, all inputs share obvious characteristics, which makes extraction easier than for arbitrary curves. The input features are:

  1. Most curves are enclosed in a black box, with the abscissa below the box and the ordinate to its left
  2. If a figure contains several curves, their colors mostly differ
  3. The curves are mostly smooth, run continuously from the far left to the far right of the box, and decrease monotonically
  4. A curve may be solid or dashed
  5. The curve label is generally at the upper right of the line
  6. The image background is entirely white

Extraction process:

  1. Find the position of the box or the coordinate axes and crop out the region to analyze
  2. Distinguish the lines by color and output as many lines as there are colors, sorted in descending order of the number of points they contain. The case of two lines sharing one color is ignored
  3. Select points on the line from left to right, each time taking the first point from top to bottom, subject to the requirement that the curve decreases monotonically
  4. Restore the original function by interpolation or fitting and output 100 equidistant interpolation points

Box cropping

See tailor.py. The core idea is to find the x and y coordinate axes:

  1. Identify horizontal and vertical lines from the mean pixel value of each row and column
  2. Among these lines, the coordinate axes generally lie nearest the left or the bottom of the image
  3. Because there must be axis tick labels on the outer side of a coordinate axis, exclude horizontal and vertical lines whose left/bottom side is completely white
  4. Determine the cropping box from the coordinate axes and crop

image-20220709213315230
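
Steps 1, 2, and 4 can be sketched as follows. This is a simplified illustration, not the code in tailor.py: the thresholds dark and ratio are assumed values, the function names are hypothetical, and the exclusion rule of step 3 is omitted.

```python
import numpy as np


def find_axes(gray, dark=128, ratio=0.5):
    """Return (x_axis_row, y_axis_col): the bottom-most row and the left-most
    column in which at least `ratio` of the pixels are darker than `dark`."""
    is_dark = gray < dark
    rows = np.flatnonzero(is_dark.mean(axis=1) >= ratio)
    cols = np.flatnonzero(is_dark.mean(axis=0) >= ratio)
    if rows.size == 0 or cols.size == 0:
        raise ValueError("no axis-like line found")
    return int(rows[-1]), int(cols[0])


def crop_plot_area(gray):
    """Keep the region above the x axis and to the right of the y axis."""
    row, col = find_axes(gray)
    return gray[:row, col + 1:]


img = np.full((20, 30), 255, dtype=np.uint8)
img[2:16, 5] = 0                              # y axis
img[15, 5:28] = 0                             # x axis
print(find_axes(img))                         # (15, 5)
```

On figures with a full box, the top and right borders would remain inside this crop and would need the same treatment on the opposite sides.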

Separate lines by color

Since each line has its own color and the colors differ considerably, the lines can be told apart by color. The following metrics can be used to measure the difference between lines:

  • Hue: after conversion to HSV space, the hue is a value between 0 and 1
  • Brightness, i.e. the grayscale value of the pixel
  • Color space distance, i.e. the sum of the absolute differences of the RGB values

Note: to make it easier to generate a new image, the image is inverted before separation. Gray values below 50 are treated as black, and gray values above 235 are treated as white.

| Round | Number of colors | Color band |
| --- | --- | --- |
| 0 | 215 | image-20220707092403611 |
| 1 | 1647 | image-20220707092544177 |
| 2 | 6986 | - |
| 3 | 9998 | - |

image-20220709213412566
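
The color space distance metric can be turned into a simple greedy grouping, sketched below. This is not the algorithm in seperator.py: the tolerance tol is an assumed value, the function name is hypothetical, and the sketch works on the original (not inverted) image.

```python
import numpy as np


def separate_by_color(rgb, tol=60, low=50, high=235):
    """Group coloured pixels into lines; return boolean masks, largest first."""
    pixels = rgb.astype(int)
    gray = pixels.mean(axis=2)
    remaining = (gray >= low) & (gray <= high)    # skip near-black and near-white
    masks = []
    while remaining.any():
        y, x = np.argwhere(remaining)[0]          # seed colour for this round
        dist = np.abs(pixels - pixels[y, x]).sum(axis=2)
        mask = remaining & (dist <= tol)
        masks.append(mask)
        remaining &= ~mask
    return sorted(masks, key=lambda m: m.sum(), reverse=True)


img = np.full((10, 10, 3), 255, dtype=np.uint8)
img[2, :] = (255, 0, 0)                           # red line, 10 pixels
img[5, :6] = (0, 0, 255)                          # blue line, 6 pixels
print([int(m.sum()) for m in separate_by_color(img)])  # [10, 6]
```

Sorting the masks by pixel count matches the rule of outputting lines in descending order of the number of points they contain.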

Order selection

Select points on the line from left to right, each time taking the first point from top to bottom, subject to the monotonic decrease requirement. Selecting points this way avoids picking up the curve labels, and because the curve is constrained to decrease monotonically, the jitter cannot become too large.

image-20220709213600083
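
The selection rule can be sketched as follows, assuming the line is given as a boolean mask. Since image rows grow downward, a monotonically decreasing curve corresponds to a non-decreasing row index; this reading of the rule and the function name are assumptions, not taken from extractor.py.

```python
import numpy as np


def select_points(mask):
    """Scan columns left to right; in each column take the topmost line pixel
    that does not lie above the previously selected point."""
    points = []
    last_row = -1
    for x in range(mask.shape[1]):
        rows = np.flatnonzero(mask[:, x])
        rows = rows[rows >= last_row]         # enforce the monotonic decrease
        if rows.size:
            last_row = int(rows[0])
            points.append((x, last_row))
    return points


mask = np.zeros((6, 5), dtype=bool)
for x, r in enumerate([1, 1, 2, 3, 4]):
    mask[r, x] = True                         # the curve
mask[0, 3] = True                             # a stray label pixel above the curve
print(select_points(mask))
```

The label pixel at the top of column 3 is skipped because it lies above the last selected point.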

Fit or interpolate

Fit

When fitting a function to the sample points, the curve does not have to pass through every sample point. The difficulty lies in choosing the form of the function: it is hard to find a function that fits the curve well. The figure below shows the result of fitting a polynomial with terms of degree -3 to +3. The seven coefficients are:

[-5.81978492e+01  2.23481625e+02 -1.84351082e+02  1.34665445e+02
 -2.35943787e+00  1.71681831e-02 -1.61449275e-05]

This approach does not represent the curve well.

image-20220709212917925
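
A least-squares fit with terms of degree -3 to +3 can be sketched as below. How the project actually builds and orders the seven terms is not stated in the source, so the function names and the ascending power order here are assumptions.

```python
import numpy as np


def fit_powers(x, y, low=-3, high=3):
    """Least-squares fit of y = sum_k c_k * x**k for k = low..high (x must be nonzero)."""
    x = np.asarray(x, dtype=float)
    powers = np.arange(low, high + 1, dtype=float)
    design = x[:, None] ** powers[None, :]    # one column per power of x
    coeffs, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return powers, coeffs


def evaluate(x, powers, coeffs):
    """Evaluate the fitted function at the positions x."""
    x = np.asarray(x, dtype=float)
    return (x[:, None] ** powers[None, :]) @ coeffs


x = np.linspace(1, 3, 50)
y = 2 / x + 0.5 * x ** 2
powers, coeffs = fit_powers(x, y)
print(np.allclose(evaluate(x, powers, coeffs), y, atol=1e-6))
```

The fit is exact here only because the test data lies in the span of the chosen terms; real thermogravimetric curves generally do not, which matches the poor result reported above.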

Interpolation

Interpolation is much easier and more accurate than finding a fitting function, but the result may jitter or be unsmooth. At the four places circled in the figure, the interpolated curve jitters as well.

image-20220707230356099
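
Producing the 100 equidistant output points from the selected samples can be sketched as follows. Linear interpolation with np.interp is assumed here for simplicity; the project may use a different interpolation scheme, and resample is a hypothetical name.

```python
import numpy as np


def resample(points, n=100):
    """Interpolate (x, y) samples onto n equally spaced x positions."""
    pts = np.asarray(points, dtype=float)
    order = np.argsort(pts[:, 0])             # np.interp needs increasing x
    xs, ys = pts[order, 0], pts[order, 1]
    grid = np.linspace(xs[0], xs[-1], n)
    return grid, np.interp(grid, xs, ys)


grid, values = resample([(10, 0), (0, 10), (5, 6)])
print(len(grid), values[0], values[-1])
```

Linear interpolation passes through every sample, so any jitter in the selected points carries over directly to the output.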

Possible solutions to the jitter:

  1. Where the curve jitters, for example where the absolute value of a higher-order derivative is large, delete the sample points in the neighborhood and interpolate again. The threshold needs manual tuning
  2. Reconstruct the contour with Fourier descriptors and smooth it by adjusting the number of terms M

This step can be optimized further.

Extensions

Image to PDF function

Image/PDF to GIF function

Remove watermark function

Watermarks come in different forms. The common ones are:

  • Pure gray watermark: the three RGB values are equal, and the watermark is clearly lighter than the other colors
  • Colored watermark: the hue clearly differs from that of the other colors
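
For the pure gray case, removal can be sketched as a simple per-pixel test. The thresholds min_gray and max_spread are assumed values chosen for illustration, and the function name is hypothetical.

```python
import numpy as np


def remove_gray_watermark(rgb, min_gray=180, max_spread=10):
    """Replace light, nearly neutral-gray pixels with white."""
    pixels = rgb.astype(int)
    spread = pixels.max(axis=2) - pixels.min(axis=2)   # 0 when R = G = B
    is_watermark = (spread <= max_spread) & (pixels.min(axis=2) >= min_gray)
    out = rgb.copy()
    out[is_watermark] = 255
    return out


img = np.array([[[0, 0, 0], [200, 200, 200]],
                [[255, 0, 0], [255, 255, 255]]], dtype=np.uint8)
print(remove_gray_watermark(img)[0, 1])       # [255 255 255]
```

Dark text and colored content are left untouched because they fail either the brightness test or the neutrality test.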

Composite large image function

Horizontal split function

The goal is to split a sheet of music into separate images, one per group of staves.

Summing the pixels of each row reveals clear gaps between the lines of the score. The 12 lines in total (6 groups) correspond to the 12 peaks in the plot on the right, and each peak contains 5 sub-peaks, namely the five lines of the stave.

image-20220710115129364

Rows whose gray value is greater than thresh are treated as rows containing content, and adjacent rows are aggregated. If the gap between two rows is greater than distance, they are split into separate groups. With thresh=80 and distance=30, the output is as follows:

{0: [171, 177, 182, 183, 188, 194], 1: [230, 231, 236, 242, 247, 248, 253], 2: [305, 327, 328, 333, 334, 339, 345, 350, 351], 3: [387, 388, 393, 398, 399, 404, 410, 411], 4: [484, 485, 490, 495, 496, 501, 502, 507], 5: [544, 549, 550, 555, 561, 566, 567], 6: [624, 625, 641, 646, 647, 652, 653, 658, 663, 664], 7: [700, 701, 706, 712, 717, 718, 723], 8: [797, 798, 803, 804, 809, 814, 815, 820, 821], 9: [861, 866, 867, 872, 878, 883, 884], 10: [958, 963, 964, 969, 970, 975, 980, 981], 11: [1017, 1018, 1023, 1029, 1034, 1035, 1040]}

Visualization:

image-20220710121526274

This method finds all the staves. The cuts can then be placed at the center of each gap. For the first and last groups, the mean of all gap widths is used instead.
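
The grouping and cutting logic can be sketched in plain Python. The input is assumed to be the sorted list of row indices that passed the thresh test; both function names are hypothetical.

```python
def group_rows(rows, distance=30):
    """Group sorted row indices; a gap larger than `distance` starts a new group."""
    groups = []
    for r in rows:
        if groups and r - groups[-1][-1] <= distance:
            groups[-1].append(r)
        else:
            groups.append([r])
    return groups


def cut_positions(groups):
    """Place one cut in the middle of each gap between neighbouring groups."""
    return [(a[-1] + b[0]) // 2 for a, b in zip(groups, groups[1:])]


rows = [171, 177, 182, 183, 188, 194, 230, 231, 236, 242, 247, 248, 253, 305, 327]
groups = group_rows(rows)
print(len(groups), cut_positions(groups))     # 3 [212, 279]
```

The sample rows are taken from the first groups of the output above, so the three resulting groups match groups 0 to 2 (the last one truncated).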

Error log

Error at program startup: recursion is detected during loading of cv2 binary extensions. check opencv installation

Cause: the installed OpenCV version is wrong. Fix: pip install opencv-python==4.5.3.56

Image rendering failed:

blocked by CORS policy: The request client is not a secure context and the resource is in more-private address space local.
