Question: Image Binarization In computer vision, image binarization , a.k.a. thresholding is the process of taking a grayscale image and converting it into a black and

Image Binarization

In computer vision, image binarization, a.k.a. thresholding is the process of taking a grayscale image and converting it into a black and white image. In grayscale images, every pixel represents an intensity value ranging from 0 (black) to 255 (white). In black and white images, every pixel is either 0 or 255. Intensity refers to the brightness of a color, white is the brightest and therefore the most intense, black is the darkest and the least intense. The figure below shows an example of image binarization:

Global Thresholding

The process of image binarization can be done by comparing each pixel within the input image against a predefined threshold T, and deciding whether each individual pixel gets turned white or black . Pixels whose intensity is less than T become black, all others become white. The algorithm below shows how to binarize an image.

input: Image A (grayscale) output: Image B (black and white) calculate global T for each pixel A[i][j] in A do if A[i][j] < T then B[i][j] = 0 else B[i][j] = 255 endif endfor 

The value of T can be automatically calculated by using a function. For example, T can be the average intensity, or the median of all pixels. There are hundreds of different ways for calculating T, often involving statistical measures. As you can imagine, the choice of T has strong implications on the quality of the final image, which is also dependent on the complexity of the image. Some methods will work well only with certain types of images.

For the purpose of this assignment, a global value of T must be calculated using the median value of all pixels in the input image

Local Thresholding

While global thresholding uses a constant threshold T for transforming each pixel in an image, the idea behind local thresholding is to transform pixels by only considering the surrounding area, i.e., the local neighborhood of each pixel. The neighborhood of a pixel p is a small matrix of dimensions d x d centered at p. In this case, for each pixel in the image, we are basically calculating a different T. The algorithm below shows how to binarize an image using local thresholding.

input: Image A (grayscale) output: Image B (black and white) for each pixel A[i][j] in A do calculate T[i][j] using local neighborhood if A[i][j] < T[i][j] then B[i][j] = 0 else B[i][j] = 255 endif endfor 

As an illustration, the figure below shows an example of applying local thresholding. The example uses the median as the metric for deciding the new value of a pixel.

Note that when calculating the neighborhood of pixels at the edges, where the neighborhood is not a perfect fit, we ignore all pixels that fall outside the boundaries of the image. In the illustration below, this is the case of the pixel at position [0,0] highlighted in orange.

Your Task

Your goal in this assignment is to develop a command line tool that performs image binarization, given some options provided by the user.

Command Line Arguments

Your program must accept the following command line arguments:

 either 'local' or 'global'  name of the input file  name of the output file [] size of the neighborhood 

The last argument is optional, and must be provided only when is 'local'. For example, see below a few examples of how to use your tool. Note that the correct order of command line arguments is very important.

$ ./binarizer global cover.img cover_glo.img $ ./binarizer local cover.img cover_loc_5.img 5 $ ./binarizer local cover.img cover_loc_7.img 7 $ ./binarizer local cover.img cover_loc_15.img 15 
  • The value of T for global thresholding must be the median of all pixels
  • The value of T[i,j] for local thresholding must be adib, given by the formula below. This formula is from the paper Adaptive document image binarization by Sauvola and PietikaKinen, 2000.
  • Note: If you are using std::minmax this will only work on a 1D array/vector! If you are using a 2D array/vector, you'll need to flatten it first or write your own function.

Image file format

Each image is encoded as a matrix of pixel values where each pixel value is a grayscale intensity, basically an integer ranging from 0 to 255. Lets call this format as the img format and use the extension .img for such files. Within your program, you can represent an image either as a bidimensional array, or as an unidimensional array and design your algorithms properly. When loading or saving images, each img file must be a text file where pixels values are separated by a single whitespace, and organized in n rows and m columns (the image dimensions). For example, here is one file with 10 rows and 8 columns. We prepared a few conversion scripts that can help you test your program with real-world examples, please refer to this document.

121 24 149 1 173 251 10 38 97 137 153 92 40 93 9 149 138 136 128 18 66 109 16 138 185 218 67 3 194 155 186 255 131 50 188 128 120 193 104 144 39 109 228 155 131 42 133 93 75 148 197 137 26 198 0 226 43 85 167 158 28 207 17 165 14 150 49 205 79 86 216 8 88 78 159 41 66 227 84 80 

Note that every pixel value is separated by a single whitespace. There should not be any trailing whitespaces.

Submission and Grading

Your submission will be tested and graded by an autograder, for this reason it cannot be stressed enough that your program must exactly follow the specifications for input and output upon submission. Please use main.cpp as the single name for your program. We expect you to design your own functions and classes to model this problem, and all should be within the same file. Your program will be compiled with the following line:

$ g++ -std=c++11 -Wall main.cpp -o prog

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!