How to Scale Native (C/C++) Applications on Pivotal’s MPP Platform: Edge Detection Example, Part 2

December 18, 2014 Srivatsan Ramanujam

Joint work performed by Gautam Muralidhar and Srivatsan Ramanujam

In part one of this blog series, we introduced the task of edge detection, an important problem in developing computer vision algorithms. In the previous post, we demonstrated that a sample native application can be seamlessly integrated and scaled up for data parallel problems on HAWQ, Pivotal’s SQL-on-Hadoop solution. We showed how a native application written in C++ can be scaled on Pivotal’s massively parallel processing (MPP) platform through PL/Python, which is a minimally intrusive approach to take.

In this second part, we will show how the same task can be achieved via PL/C user-defined functions (UDFs). If speed of execution is the most important criterion for your team, then it is well worth the effort of porting your native app to PL/C UDFs.

At a high level, running C++ native applications in HAWQ via PL/C involves the following steps:

  1. Adding Postgres interface functions to the native application, which will be called from the PL/C UDF in HAWQ.
  2. Compiling the native application as a shared object or dynamic library.
  3. Installing the shared object and the dependent dynamic libraries (e.g., OpenCV) on all HAWQ segment nodes.
  4. Creating a PL/C driver UDF in HAWQ, which invokes the native Postgres interface function created in step 1.
  5. Invoking the PL/C driver on the image table in HAWQ.

1) Adding Postgres interface functions to the native application

To invoke a native function in HAWQ, the function parameters and return values need to be passed between the database environment and the native code using Postgres C-language macros and functions. The code snippet in Figure 11 illustrates the Postgres C-language interfaces for the two functions in our application:

    a) getImgSizeFromByteStream and
    b) edgeDetectionFromByteStream
/**
 * Gautam Muralidhar, Srivatsan Ramanujam, Oct 2014
 * PL/C function for invoking Canny's Edge Detection from OpenCV
 **/
extern "C" {
#include <postgres.h>
#include <fmgr.h>
#include <utils/array.h>
#include <utils/builtins.h>
#include <catalog/pg_type.h>
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif
}

// C++ headers stay outside the extern "C" block: templates cannot have C linkage
#include <string>
#include <vector>
#include <stdlib.h>
#include <stdint.h>
using namespace std;

// Internal C++ functions, defined in Figure 12
void Tokenize(const string& str, vector<string>& tokens, const string& delimiters);
uint* getImgSizeFromByteStream(vector<int8_t> src);
bool edgeDetectionFromByteStream(vector<int8_t> src, uint* result);

extern "C" {

// Postgres interface for getImgSizeFromByteStream
PG_FUNCTION_INFO_V1(get_imgsize);
Datum get_imgsize(PG_FUNCTION_ARGS) {
    if (PG_ARGISNULL(0)) {
        ereport(ERROR, (errmsg("Null arrays not accepted")));
    }
    // Collect the input image byte stream as a C string
    char* cstr = TextDatumGetCString(PG_GETARG_DATUM(0));
    string str(cstr);
    vector<string> tokens;
    // Tokenize the string on "," and collect the tokens, which represent individual bytes of the image
    Tokenize(str, tokens, ",");
    vector<int8_t> src;
    for (size_t i = 0; i < tokens.size(); i++) {
        const char* tk = tokens[i].c_str();
        int bt = atoi(tk);
        src.push_back(bt);
    }
    // Call the internal getImgSizeFromByteStream function
    uint* imgSize = getImgSizeFromByteStream(src);
    Datum* imgSizeArray = (Datum*)palloc(sizeof(Datum) * 2);
    for (int i = 0; i < 2; i++) {
        imgSizeArray[i] = Int32GetDatum(imgSize[i]);
    }
    // imgSize was allocated with new[] in getImgSizeFromByteStream
    delete[] imgSize;
    // Construct the array to be returned back to the database client
    ArrayType* res = construct_array(imgSizeArray, 2, INT4OID, 4, true, 'i');
    PG_RETURN_ARRAYTYPE_P(res);
}

// Postgres interface for edgeDetectionFromByteStream
PG_FUNCTION_INFO_V1(canny_plc);
Datum canny_plc(PG_FUNCTION_ARGS) {
    if (PG_ARGISNULL(0)) {
        ereport(ERROR, (errmsg("Null arrays not accepted")));
    }
    // Collect the input image byte stream as a C string
    char* cstr = TextDatumGetCString(PG_GETARG_DATUM(0));
    string str(cstr);
    vector<string> tokens;
    // Tokenize the string on "," and collect the tokens, which represent individual bytes of the image
    Tokenize(str, tokens, ",");
    vector<int8_t> src;
    for (size_t i = 0; i < tokens.size(); i++) {
        const char* tk = tokens[i].c_str();
        int bt = atoi(tk);
        src.push_back(bt);
    }
    // Call the internal getImgSizeFromByteStream function
    uint* imgSize = getImgSizeFromByteStream(src);
    uint npixels = imgSize[0] * imgSize[1];
    delete[] imgSize;
    uint* edge_result = new uint[npixels];
    // Call the internal edgeDetectionFromByteStream function
    bool success = edgeDetectionFromByteStream(src, edge_result);
    if (success) {
        Datum* edgeArray = (Datum*)palloc(sizeof(Datum) * npixels);
        for (uint i = 0; i < npixels; i++) {
            edgeArray[i] = Int32GetDatum(edge_result[i]);
        }
        delete[] edge_result;
        // Construct the array to be returned back to the database client
        ArrayType* res = construct_array(edgeArray, npixels, INT4OID, 4, true, 'i');
        PG_RETURN_ARRAYTYPE_P(res);
    } else {
        delete[] edge_result;
        ereport(ERROR, (errmsg("Edge detection failed")));
    }
}

}  // extern "C"

Figure 11: Postgres C-interfaces for the getImgSizeFromByteStream and edgeDetectionFromByteStream Functions

As illustrated in Figure 11, the Postgres C declaration of a function is always Datum function_name(PG_FUNCTION_ARGS). In addition, for functions that are dynamically loaded, the macro PG_FUNCTION_INFO_V1(func_name) has to precede the function declaration. The magic block PG_MODULE_MAGIC guards against the loading of the dynamic library into an incompatible database server. The macros require that users include the header file fmgr.h.

Also, as illustrated in Figure 11, the input function parameters are checked for null using the Postgres macro PG_ARGISNULL, and the parameter values are retrieved using the macros PG_GETARG_DATUM and TextDatumGetCString. The array return values of the functions are constructed using the Postgres C built-in function construct_array and returned using the macro PG_RETURN_ARRAYTYPE_P. Finally, the internal functions getImgSizeFromByteStream, edgeDetectionFromByteStream, and Tokenize are regular C++ functions, as shown in Figure 12.
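Before turning to those internal functions, note that the interface functions in Figure 11 convert each comma-separated token with atoi, which silently turns malformed tokens into 0. The following is a minimal sketch of a stricter parsing step, not the code used in our application. The helper name parseByteStream and the choice of uint8_t storage are assumptions made for illustration.

```cpp
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original application): parses a
// comma-separated list of decimal byte values such as "255,216,255" into
// raw bytes. Returns false if a token is empty, non-numeric, or outside
// the range 0..255, or if the input contains no tokens at all.
bool parseByteStream(const std::string& csv, std::vector<uint8_t>& out) {
    out.clear();
    std::istringstream in(csv);
    std::string token;
    while (std::getline(in, token, ',')) {
        char* end = nullptr;
        long value = std::strtol(token.c_str(), &end, 10);
        // end == token.c_str(): no digits were consumed.
        // *end != '\0': something other than digits followed the number.
        if (end == token.c_str() || *end != '\0' || value < 0 || value > 255) {
            return false;
        }
        out.push_back(static_cast<uint8_t>(value));
    }
    return !out.empty();
}
```

A wrapper could call such a helper right after TextDatumGetCString and raise an error through ereport when it returns false, rather than handing a partially garbled buffer to OpenCV.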

/**
 * Gautam Muralidhar, Nov 2014
 * Internal C++ functions for Canny Edge Detection.
 *
 **/
#include <iostream>
#include <string>
#include <vector>
#include "opencv2/opencv.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <stdint.h>
using namespace std;

// Split str on any of the delimiter characters and append the pieces to tokens
void Tokenize(
    const string& str,
    vector<string>& tokens,
    const string& delimiters = " ") {
    // Skip delimiters at beginning.
    string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // Find the first delimiter after the start of the token.
    string::size_type pos = str.find_first_of(delimiters, lastPos);
    while (string::npos != pos || string::npos != lastPos)
    {
        // Found a token, add it to the vector.
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // Skip delimiters. Note the "not_of"
        lastPos = str.find_first_not_of(delimiters, pos);
        // Find the next delimiter
        pos = str.find_first_of(delimiters, lastPos);
    }
}

// Decode the image, run Canny edge detection, and write a 0/1 edge map
// into result in row-major order. Returns false if the image cannot be decoded.
bool edgeDetectionFromByteStream(vector<int8_t> src, uint* result) {
    cv::Mat srcImg, srcGray, dstImg, onesImg, edges;
    srcImg = cv::imdecode(src, CV_LOAD_IMAGE_COLOR);
    if (srcImg.data) {
        /// Create matrices of the same size as src (for dst)
        dstImg.create(srcImg.size(), CV_8UC1);
        onesImg.create(srcImg.size(), CV_8UC1);
        /// Convert the image to grayscale
        cv::cvtColor(srcImg, srcGray, cv::COLOR_BGR2GRAY);
        /// Blur the image first
        cv::blur(srcGray, edges, cv::Size(3,3));
        /// Call Canny's edge detect
        cv::Canny(edges, edges, 10, 30, 3);
        /// Use the edge image as a mask: dstImg is 1 on edge pixels, 0 elsewhere
        dstImg = cv::Scalar::all(0);
        onesImg = cv::Scalar::all(1);
        onesImg.copyTo(dstImg, edges);
        for (int i = 0; i < dstImg.rows; i++) {
            for (int j = 0; j < dstImg.cols; j++) {
                result[(edges.cols)*i+j] = uint(dstImg.at<unsigned char>(i,j));
            }
        }
        return true;
    } else {
        return false;
    }
}

// Return a two-element array {rows, cols} allocated with new[].
// The caller is responsible for releasing it with delete[].
uint* getImgSizeFromByteStream(vector<int8_t> src) {
    cv::Mat srcImg;
    srcImg = cv::imdecode(src, CV_LOAD_IMAGE_COLOR);
    if (srcImg.data) {
        uint* result = new uint[2];
        result[0] = srcImg.rows;
        result[1] = srcImg.cols;
        return result;
    } else {
        // Decode failed. Note that -1 stored in a uint wraps to the largest unsigned value.
        uint* result = new uint[2];
        result[0] = -1;
        result[1] = -1;
        return result;
    }
}

Figure 12: Internal C++ functions of the Canny Edge Detection application
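One detail of Figure 12 is worth guarding against in a wrapper: when decoding fails, getImgSizeFromByteStream stores -1 in an unsigned array, so the caller sees a very large number rather than a negative one, and multiplying rows by columns can also overflow. The sketch below shows one way a caller might validate the size before allocating the result buffer. The function name checkedPixelCount, the 32-bit width assumed for uint, and the upper bound kMaxPixels are all assumptions for illustration, not part of our application.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical guard (illustrative only): validates an image size reported
// as two unsigned 32-bit values and computes rows * cols without overflow.
// Returns false for the "decode failed" sentinel, for empty images, and for
// images larger than an arbitrary, assumed limit.
bool checkedPixelCount(uint32_t rows, uint32_t cols, std::size_t& count) {
    // This is the value that -1 becomes when stored in a 32-bit unsigned int.
    const uint32_t kDecodeFailed = std::numeric_limits<uint32_t>::max();
    if (rows == kDecodeFailed || cols == kDecodeFailed) {
        return false;
    }
    if (rows == 0 || cols == 0) {
        return false;
    }
    // Multiply in 64 bits so that the product cannot wrap around.
    const uint64_t total = static_cast<uint64_t>(rows) * cols;
    // Assumed upper bound (about 268 million pixels); tune for your data.
    const uint64_t kMaxPixels = 1ULL << 28;
    if (total > kMaxPixels) {
        return false;
    }
    count = static_cast<std::size_t>(total);
    return true;
}
```

With a check of this kind in place, a failed decode can be reported as a database error before any large allocation is attempted.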

2) Compiling the Native Application as a Shared Object

As was shown with the PL/Python example, it is possible to invoke native applications in HAWQ. To do this, the user must first compile the application as a dynamic library (.so file). For PL/C, the application needs to be compiled against the correct Postgres headers, which may require adding the header directories to the compiler's include path. For example, on a CentOS system, the command illustrated in Figure 13 can be used to compile the Canny edge detection application, which we discussed in the previous post, as a shared object for PL/C.

g++ -shared -Wl,-soname,canny_plc -fPIC -I/usr/local/hawq-1.2.0.0/include/postgresql/server/ -I/usr/local/hawq-1.2.0.0/include/postgresql/internal -I/usr/local/hawq-1.2.0.0/include/ -o canny_plc.so -lopencv_core -lopencv_imgproc -lopencv_highgui CannyPLC.cpp

Figure 13: Building a C++ application into a shared object for PL/C

3) Installing the Dynamic Library and Its Dependencies on all HAWQ Segment Nodes

Once the shared object has been built, the next step is to install the shared object and the dependent libraries on all HAWQ segment nodes. This is achieved via the gpscp command as illustrated in Figure 14.

gpscp -f hostfile canny_plc.so libopencv_core.so.2.4 libopencv_imgproc.so.2.4 libopencv_highgui.so.2.4 =:/usr/local/lib/ds

Figure 14: Installing the shared object and dependent libraries on all HAWQ segment nodes

The gpscp command takes three inputs: the hostfile parameter (a file containing the host names of the HAWQ segment nodes, in our case hdw1 to hdw16), the filenames of the shared objects to copy, and the path of the destination directory on the segment nodes (in our case, /usr/local/lib/ds).

Now that our dynamic libraries have been distributed to all segment nodes, ensure that the environment variable LD_LIBRARY_PATH includes the directory we copied our files to, and then restart HAWQ. This can be achieved by appending an export line to ~/.bashrc on all segment nodes and restarting the cluster:

gpssh -f hostfile
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/ds' >> ~/.bashrc
gpstop -r

4) Creating a PL/C Driver UDF in HAWQ, Which Invokes the Native Postgres Interface Function Created in Step 1

Once the shared object and the dependent libraries have been distributed on HAWQ segment nodes, our C++ application can now be invoked in HAWQ via the PL/C UDFs. The code snippet in Figure 15 illustrates the PL/C UDFs.

/**
 * Gautam Muralidhar, Srivatsan Ramanujam, Nov 2014
 * PL/C UDFs to invoke Edge Detection
 */
create or replace function CannyEdgeDetectPLC(varchar)
    returns int[]
as
    '/usr/local/lib/ds/canny_plc.so', 'canny_plc'
language C strict immutable;

create or replace function GetImgSizePLC(varchar)
    returns int[]
as
    '/usr/local/lib/ds/canny_plc.so', 'get_imgsize'
language C strict immutable;

Figure 15: PL/C UDFs for Invoking the C++ Application

As illustrated in Figure 15, the PL/C UDF CannyEdgeDetectPLC takes as input an image whose byte stream is encoded as a string (varchar). Our UDF invokes the Postgres interface function canny_plc in the C++ shared object canny_plc.so and returns an array containing the image edge map. Similarly, the PL/C UDF GetImgSizePLC takes as input an image whose byte stream is encoded as a string (varchar), invokes the Postgres interface function get_imgsize, and returns a two-element array holding the image size: the number of image rows and the number of image columns.
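Because the edge map comes back as a flat array, a client needs the image size to interpret it. Figure 12 fills the array in row-major order, so the pixel at (row, col) lives at index ncols * row + col. The sketch below illustrates that layout on the client side. The helper names edgeAt and countEdgePixels are hypothetical, and the edge map is assumed to have already been fetched into a std::vector<int>.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the value (0 or 1) of pixel (row, col) in a
// flat, row-major edge map, or -1 if the coordinates are out of range or
// the array length does not match nrows * ncols.
int edgeAt(const std::vector<int>& edges, int nrows, int ncols, int row, int col) {
    if (nrows <= 0 || ncols <= 0) {
        return -1;
    }
    const std::size_t expected =
        static_cast<std::size_t>(nrows) * static_cast<std::size_t>(ncols);
    if (edges.size() != expected) {
        return -1;
    }
    if (row < 0 || row >= nrows || col < 0 || col >= ncols) {
        return -1;
    }
    // Same indexing as the fill loop in Figure 12: cols * i + j.
    return edges[static_cast<std::size_t>(ncols) * row + col];
}

// Hypothetical helper: counts the pixels marked as edges (value 1).
std::size_t countEdgePixels(const std::vector<int>& edges) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < edges.size(); ++i) {
        if (edges[i] == 1) {
            ++count;
        }
    }
    return count;
}
```

For a 2 x 3 image stored as {0, 1, 0, 1, 1, 0}, the first three entries form row 0 and the last three form row 1.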

5) Invoking the PL/C UDFs on the Image Table in HAWQ

The PL/C UDFs are invoked as normal SQL functions, as illustrated in Figure 16. The return arguments of the PL/C functions (arrays) are stored in HAWQ as individual columns in the table canny_edge_table_plc.

create table ocv.canny_edge_table_plc
as
(
select GetImgSizePLC(img) as imgsize,
CannyEdgeDetectPLC(img) as edges
from ocv.src_image
) distributed randomly;

Figure 16: Invoking the PL/C UDFs

Finally, the results of our edge detection application can be checked via the convenient Pandas-via-psql command line tool, which was developed here at Pivotal. If you have Anaconda Python, you can simply run the following command to install this visualization utility.

pip install ppsqlviz

Figures 17 and 18 illustrate the pandas-via-psql command and the result of our edge detection application on a sample image.

psql -d <dbname> -h <HAWQ master hostname> -U <username> -c "select imgsize[1] as nrows, imgsize[2] as ncols, edges from ocv.canny_edge_table_plc limit 1;" | python -m 'ppsqlviz.plotter' image

Figure 17: Displaying the result using the pandas-via-psql utility

Figure 18: Example edge detection result

With the image edges now available in HAWQ, the user can proceed in HAWQ with the remaining steps of a computer vision workflow (for example, an object recognition workflow), such as feature computation and machine learning.

Pros and Cons: PL/C UDFs

In comparison to the PL/Python UDF approach presented in the previous blog post, the PL/C UDF approach requires the use of Postgres C-language macros and functions for handling function arguments and data. This approach is a bit intrusive, in that we need to maintain a separate wrapper over the native app. Furthermore, the code is more complex than the Python equivalent. Depending on the comfort level of the team with Postgres C-language functions, it might take some time to add the new code and rebuild the shared objects, but it is definitely worth the effort if the speed of execution is your most important criterion. For example, when we used our PL/C function to run edge detection on the same 907 images on the same HAWQ cluster of 16 nodes, we got the results in 7 seconds.

To summarize, in this two-part blog post series we demonstrated how a sample native application was seamlessly integrated and scaled up for data parallel problems on HAWQ. Using the Canny Edge Detection problem as an example, we presented two approaches to scaling the native app. The less intrusive and faster to implement of the two was PL/Python UDFs with ctypes. PL/C is a more performant, though more laborious, approach that offers its own set of advantages. We believe the right choice depends on your problem, your team’s expertise, and which is more valuable: engineers' time or system time. In the end, that’s a decision best addressed by you, our readers!
