DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • Non-blocking Database Migrations
  • Keep Your Application Secrets Secret
  • CockroachDB TIL: Volume 11
  • Deploy a Hyperledger Fabric v2 Web App Using the Node.js SDK

Trending

  • The Modern Data Stack Is Overrated — Here’s What Works
  • What Is Plagiarism? How to Avoid It and Cite Sources
  • How AI Agents Are Transforming Enterprise Automation Architecture
  • Understanding IEEE 802.11(Wi-Fi) Encryption and Authentication: Write Your Own Custom Packet Sniffer
  1. DZone
  2. Data Engineering
  3. Databases
  4. Turbocharge Your Application Development Using WebAssembly With SingleStoreDB

Turbocharge Your Application Development Using WebAssembly With SingleStoreDB

In this article, learn how to build a Wasm UDF to perform sentiment analysis on data already stored in SingleStoreDB.

By 
Akmal Chaudhri user avatar
Akmal Chaudhri
DZone Core CORE ·
Oct. 07, 22 · Tutorial
Likes (1)
Comment
Save
Tweet
Share
5.7K Views

Join the DZone community and get the full member experience.

Join For Free

WebAssembly (Wasm) is a binary instruction format for a stack-based virtual machine. Wasm enables developers to use existing code from programming languages such as C, C++, and Rust as part of their application development process. However, Wasm is not just for the web, and today it is moving in exciting new directions. For example, one use-case would be to run Wasm code in a database system on the data already stored in the database system: an example of co-locating computation with data. Using Wasm to extend the capabilities of a database system opens up opportunities to develop many new applications. SingleStoreDB supports Wasm through Code Engine and, in this article, we'll see how to build a Wasm UDF to perform sentiment analysis on data already stored in SingleStoreDB.

Introduction

We'll need to perform a few steps to prepare our development environment, and the following sections will show how to do this. We'll also use Rust to create our Wasm UDF. Our code example will build upon the example described in the SingleStore blog post [r]evolution Summer 2022: Bring Application Logic to Your Data With SingleStoreDB Code Engine for Wasm.

Create a SingleStoreDB Cloud Account

To use Wasm on SingleStoreDB Cloud, we'll first create a free account from the portal. After we have logged in to the portal, we need to check that we can see the option to Create a new workspace group in the left-hand navigation pane, as shown in Figure 1.

Create a new workspace group

Figure 1: Create a new workspace group

In the center of the webpage, we should also see the option to Create a Workspace, as shown in Figure 2.

Create Workspace

Figure 2: Create Workspace

We'll click the Create Workspace button. On the next page, we'll be presented with five options:

  1. Workspace Group Name: We'll call our group Wasm Demo Group.
  2. Cloud Provider: We have the option to use AWS, GCP, or Azure. In this article, we'll use AWS.
  3. Region: A drop-down menu provides a list of supported regions. We can choose a region or use the default.
  4. Firewall Setting: For initial setup and testing, we'll allow Access from anywhere. We can change this later.
  5. Set Password: We'll create a password using the Generate Strong Password button and save the password in a safe place.

At the bottom of the webpage, we'll click Next.

On the next page, we'll be presented with three options:

  1. Workspace Details: We'll call our workspace wasm-demo.
  2. Size: S-00 will be fine for initial testing.
  3. Confirm Workspace Group: In the drop-down menu, we should check that the Wasm Demo Group is selected.

At the bottom of the page, we'll click Create Workspace.

After a short time, the Workspace Group should be successfully created and available.

Under the Overview tab (Figure 3):

Select Overview tab

Figure 3: Overview

If we scroll down, we'll see our wasm-demo workspace, as shown in Figure 4.

Wasm-demo: Connect Directly

Figure 4: wasm-demo workspace

Using the Connect pulldown, we'll select Connect Directly to obtain the information we need to connect to SingleStoreDB Cloud using various clients, such as a MySQL CLI client. The CLI command should be similar to the following:

Shell
 
mysql -u admin \
      -h <TO DO> \
      -P 3306 \
      --default-auth=mysql_native_password \
      -p


We need a minor variation of this command so that we can also use local files, if necessary, as follows:

Shell
 
mysql --local-infile \
      -u admin \
      -h <TO DO> \
      -P 3306 \
      --default-auth=mysql_native_password \
      -p


The <TO DO> for host (-h) is being replaced with the value obtained from SingleStoreDB Cloud.

Setup Local Wasm Development Environment

We can quickly create a local Wasm development environment using a few steps. However, SingleStore also provides a Docker Container as an alternative. We'll discuss this alternative later in this article.

Install the Software

First, we'll download the wasi-sdk. We'll use wasi-sdk-16.0-linux.tar.gz, the latest version available when writing this article. We'll move the file to the /opt directory and unpack it as follows:

Shell
 
cd /opt

sudo cp /path/to/wasi-sdk-16.0-linux.tar.gz .

sudo tar xzvf wasi-sdk-16.0-linux.tar.gz


We'll replace /path/to/ with the actual path where we downloaded the file. We'll also need to ensure that we add the bin directory to our PATH variable, as follows:

Shell
 
export PATH=/opt/wasi-sdk-16.0/bin:$PATH


Second, we'll download and install the Rust toolchain, as follows:

Shell
 
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh


To configure the current shell, we'll need to run:

Shell
 
source "$HOME/.cargo/env"


Finally, we'll install wit-bindgen, as follows:

Shell
 
cargo install --git https://github.com/bytecodealliance/wit-bindgen wit-bindgen-cli


To deploy our Wasm module to SingleStoreDB, we'll use the pushwasm tool. First, we'll clone the GitHub repo to a convenient location:

Shell
 
git clone https://github.com/singlestore-labs/pushwasm


Next, we'll change to the pushwasm directory and build the code, as follows:

Shell
 
cd pushwasm

cargo build --release


A new file should be written to target/release/pushwasm, and this should be added to our PATH variable.

NOTE: We may also need to run the following to ensure a successful pushwasm build:

Shell
 
sudo apt install libssl-dev


Initialize the Source Tree

Next, let's create a new directory called workdir in our home folder:

 
cd

mkdir workdir

cd workdir


From the workdir, we'll now create a skeletal Rust source tree, as follows:

Shell
 
cargo init --vcs none --lib


We'll need to add wasm32-wasi to the Rust toolchain as it is not installed by default:

Shell
 
rustup target add wasm32-wasi


Create the .wit File

The code we'll use below for our Wasm UDF is also available on GitHub.

In our workdir, we'll now create a file called sentimentable.wit that contains the interface definition. In this file, we'll add the following:

Plain Text
 
record polarity-scores {
    compound: float64,
    positive: float64,
    negative: float64,
    neutral: float64,
}

sentimentable: func(input: string) -> list<polarity-scores>


We'll define a function sentimentable that will take a string, perform sentiment analysis on that string and return a list of polarity scores.

Implement and Compile

In our workdir we'll replace the existing contents of Cargo.toml with the following code:

Plain Text
 
[package]
name = "sentimentable"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
wit-bindgen-rust = { git = "https://github.com/bytecodealliance/wit-bindgen.git", rev = "60e3c5b41e616fee239304d92128e117dd9be0a7" }
vader_sentiment = { git = "https://github.com/ckw017/vader-sentiment-rust" }
lazy_static = "1.4.0"

[lib]
crate-type = ["cdylib"]


Now we need to add the code for sentimentable, so we'll navigate to the src directory in our workdir and locate the lib.rs file. In the lib.rs file, we'll replace the existing contents with the following code:

Rust
 
wit_bindgen_rust::export!("sentimentable.wit");
use crate::sentimentable::PolarityScores;
struct Sentimentable;
impl sentimentable::Sentimentable for Sentimentable {

    fn sentimentable(input: String) -> Vec<PolarityScores> {
        lazy_static::lazy_static! {
            static ref ANALYZER: vader_sentiment::SentimentIntensityAnalyzer<'static> =
                vader_sentiment::SentimentIntensityAnalyzer::new();
        }

        let scores = ANALYZER.polarity_scores(input.as_str());
        vec![PolarityScores {
            compound: scores["compound"],
            positive: scores["pos"],
            negative: scores["neg"],
            neutral: scores["neu"],
        }]
    }
}


Our code uses VADER (Valence Aware Dictionary and sEntiment Reasoner). VADER is a lexicon and rule-based sentiment analysis tool that can interpret and classify emotions.

Next, we'll go back up one directory level:

Shell
 
cd ..


We'll now build the Wasm module:

Shell
 
cargo build --target wasm32-wasi --release


A new Wasm file should be written to target/wasm32-wasi/release/sentimentable.wasm.

Deploy

From our MySQL CLI client, we'll create a new database and switch to it, as follows:

SQL
 
CREATE DATABASE demo;

USE demo;


Next, from the command line, we'll use the pushwasm tool to push our Wasm module into SingleStoreDB, as follows:

Shell
 
pushwasm --tvf \
         --prompt \
         --wit ./sentimentable.wit \
         mysql://admin@<TO DO>/demo \
         ./target/wasm32-wasi/release/sentimentable.wasm \
         sentimentable


The <TO DO> for the host being replaced with the value obtained from SingleStoreDB Cloud. We'll be prompted for the password we created earlier when creating the workspace.

After a short time, we should see the following message:

Plain Text
 
Wasm TVF 'sentimentable' was created successfully.


From our MySQL CLI client, we can also check that the function was created using:

SQL
 
SHOW FUNCTIONS;


Run in the Database

We can quickly test our function, as follows:

SQL
 
SELECT * FROM sentimentable('The movie was great');


The result should be similar to the following:

Plain Text
 
+------------------+------------------+----------+-----------------+
| compound         | positive         | negative | neutral         |
+------------------+------------------+----------+-----------------+
| 0.62489332693894 | 0.57746478873239 |        0 | 0.4225352112676 |
+------------------+------------------+----------+-----------------+
1 row in set (0.10 sec)


VADER can consider capitalization, so we can try:

SQL
 
SELECT * FROM sentimentable('The movie was GREAT!');


The result should be similar to the following:

Plain Text
 
+------------------+------------------+----------+-----------------+
| compound         | positive         | negative | neutral         |
+------------------+------------------+----------+-----------------+
| 0.72902590497990 | 0.63076923076923 |        0 | 0.3692307692307 |
+------------------+------------------+----------+-----------------+
1 row in set (0.11 sec)


We can see that the values have changed, showing a stronger positive sentiment expressed by capitalization.

Bonus: Use the Wasm UDF With Real Sentiment Data

We can try a larger-scale test of the Wasm UDF using real sentiment data. One popular dataset is the Large Movie Review Dataset:

Plain Text
 
@InProceedings{maas-EtAl:2011:ACL-HLT2011,
  author    = {Maas, Andrew L.  and  Daly, Raymond E.  and  Pham, Peter T.  and  Huang, Dan  and  Ng, Andrew Y.  and  Potts, Christopher},
  title     = {Learning Word Vectors for Sentiment Analysis},
  booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
  month     = {June},
  year      = {2011},
  address   = {Portland, Oregon, USA},
  publisher = {Association for Computational Linguistics},
  pages     = {142--150},
  url       = {http://www.aclweb.org/anthology/P11-1015}
}


We'll download a compressed file from GitHub and extract the CSV file. The CSV file contains 25,000 rows consisting of two columns:

  1. Text
  2. Sentiment (0 = positive, 1 = negative)

In our MySQL CLI client, we'll create a table:

SQL
 
USE demo;

CREATE TABLE imdb_reviews (
   text TEXT,
   sentiment INT
);


We'll now load the data into SingleStoreDB:

SQL
 
USE demo;

LOAD DATA LOCAL INFILE '/path/to/imdb-reviews.csv'
INTO TABLE imdb_reviews
IGNORE 1 LINES
COLUMNS TERMINATED BY ','
ENCLOSED BY '"';


We'll replace /path/to/ with the actual path to the CSV file.

Once the data are loaded, we can run queries such as the following:

SQL
 
SELECT SUBSTRING(i.text, 1, 30) AS text,
  FORMAT(s.compound, 3) cpd,
  FORMAT(s.positive, 3) pos,
  FORMAT(s.negative, 3) neg,
  FORMAT(s.neutral, 3) ntrl
FROM imdb_reviews i, sentimentable(i.text) s
LIMIT 10;


The result should be similar to the following:

Plain Text
 
+--------------------------------+--------+-------+-------+-------+
| text                           | cpd    | pos   | neg   | ntrl  |
+--------------------------------+--------+-------+-------+-------+
| This is an interesting left tu | 0.996  | 0.126 | 0.060 | 0.813 |
| I don't think any movie of Van | 0.976  | 0.242 | 0.105 | 0.653 |
| This has to be THE WORST film  | -0.572 | 0.103 | 0.129 | 0.768 |
| When I saw this trailer on TV  | 0.968  | 0.155 | 0.014 | 0.831 |
| As an adult I really did enjoy | 0.896  | 0.209 | 0.046 | 0.744 |
| Another example of the unique  | 0.979  | 0.280 | 0.000 | 0.720 |
| This film was a waste of time, | 0.678  | 0.126 | 0.073 | 0.801 |
| I thought this was a splendid  | 0.988  | 0.142 | 0.085 | 0.774 |
| I can only believe that Garson | 0.968  | 0.244 | 0.119 | 0.637 |
| Gods...where to start. I was o | -0.956 | 0.092 | 0.131 | 0.777 |
+--------------------------------+--------+-------+-------+-------+
10 rows in set (0.28 sec)


Next Steps

The Code Engine — Powered by Wasm website is a great place to learn more about Wasm support in SingleStoreDB.

A Wasm tutorial is also available that works through several examples in various programming languages. Also, check out this video about Code Engine Powered by Wasm.

Summary

Wasm support in SingleStoreDB Cloud is available today and can be used by anyone. Through the worked example in this article, we have seen that Wasm provides great potential to extend the capabilities of the database engine in a safe and controlled way. It also opens up the possibilities to reuse code that developers may have built in other programming languages.

Acknowledgments

I thank my colleague Peter Vetere for his assistance and patience during the development of the code example in this article.

Command-line interface Database MySQL Plain text Sentiment analysis Software development kit WebAssembly application shell Data Types

Published at DZone with permission of Akmal Chaudhri. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • Non-blocking Database Migrations
  • Keep Your Application Secrets Secret
  • CockroachDB TIL: Volume 11
  • Deploy a Hyperledger Fabric v2 Web App Using the Node.js SDK

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • [email protected]

Let's be friends: