This intructions will lead you to step by step operations for the workshop introducing the NoSQL Databases technologies.
It doesn't matter if you join our workshop live or you prefer to do at your own pace, we have you covered. In this repository, you'll find everything you need for this workshop:
To get the participation certificate, you have to complete following steps:
- Complete the practice steps of this workshop as explained below, steps I-IV are mandatory, step V is optional. Make the screenshots of the steps (1-2 per step).
- Complete try-it-out scenario and make a screenshot of the "scenario completed" screen
- Submit the practice here attaching the screenshots.
ASTRA
is the simplest way to run Cassandra with zero operations at all - just push the button and get your cluster. No credit card required, $25.00 USD credit every month, roughly 5M writes, 30M reads, 40GB storage monthly - sufficient to run small production workloads.
β
Register (if needed) and Sign In to Astra https://astra.datastax.com: You can use your Github
, Google
accounts or register with an email
.
Make sure to chose a password with minimum 8 characters, containing upper and lowercase letters, at least one number and special character
β Create a "pay as you go" plan
Follow this guide, to set up a pay as you go database with a free $25 monthly credit.
- Select the pay as you go option: Includes $25 monthly credit - no credit card needed to set up.
You will find below which values to enter for each field.
-
For the database name -
nosql_db.
While Astra allows you to fill in these fields with values of your own choosing, please follow our recommendations to ensure the application runs properly. -
For the keyspace name -
nosql1
. It's really important that you use the name "nosql1" for the code to work.
You can technically use whatever you want and update the code to reflect the keyspace. This is really to get you on a happy path for the first run.
-
For provider and region: Choose and provider (either GCP or AWS). Region is where your database will reside physically (choose one close to you or your users).
-
Create the database. Review all the fields to make sure they are as shown, and click the
Create Database
button.
You will see your new database pending
in the Dashboard.
The status will change to Active
when the database is ready, this will only take 2-3 minutes. You will also receive an email when it is ready.
In a tabular database we will store ... tables ! The Astra Service is based on Apache Cassandra which is tabular it make sense to start by this one.
Tabular databases organize data in rows and columns, but with a twist from the traditional RDBMS. Also known as wide-column stores or partitioned row stores, they provide the option to organize related rows in partitions that are stored together on the same replicas to allow fast queries. Unlike RDBMSs, the tabular format is not necessarily strict. For example, Apache Cassandraβ’ does not require all rows to contain values for all columns in the table. Like Key/Value and Document databases, Tabular databases use hashing to retrieve rows from the table. Examples include: Cassandra, HBase, and Google Bigtable.
β 2a. Describe your Keyspace
At Database creation you provided a keyspace, a logical grouping for tables let's visualize it. In Astra go to CQL Console to enter the following commands
describe keyspaces;
β 2b. Create tables
- Execute the following Cassandra Query Language
use nosql1;
CREATE TABLE IF NOT EXISTS videos (
videoid uuid,
email text,
title text,
upload timestamp,
url text,
tags set <text>,
frames list<int>,
PRIMARY KEY (videoid)
);
- Visualize structure
describe keyspace nosql1;
β 2c. Working with DATA :
- Insert some entries on first table
INSERT INTO videos(videoid, email, title, upload, url, tags, frames)
VALUES(uuid(),
'[email protected]',
'Introduction to Nosql Databases',
toTimeStamp(now()),
'https://www.youtube.com/watch?v=cMDHxsGbmC8',
{ 'nosql','workshop','2021'},
[ 1, 2, 3, 4]);
INSERT INTO videos(videoid, email, title, upload, url)
VALUES(uuid(),
'[email protected]',
'Demo video Y',
toTimeStamp(now()),
'https://www.youtube.com/watch?v=XXXX');
INSERT INTO videos(videoid, email, title, upload, url)
VALUES(e466f561-4ea4-4eb7-8dcc-126e0fbfd573,
'[email protected]',
'Yet another video',
toTimeStamp(now()),
'https://www.youtube.com/watch?v=ABCDE');
- Read values
select * from videos;
- Read by id
select * from videos
where videoid=e466f561-4ea4-4eb7-8dcc-126e0fbfd573;
β 2d. Working with PARTITIONS :
But data can be grouped, we stored together what should be retrieved together.
- Create This new table
CREATE TABLE IF NOT EXISTS users_by_city (
city text,
firstname text,
lastname text,
email text,
PRIMARY KEY ((city), lastname, email)
) WITH CLUSTERING ORDER BY(lastname ASC, email ASC);
- Insert some entries
INSERT INTO users_by_city(city, firstname, lastname, email)
VALUES('PARIS', 'Lunven', 'Cedrick', '[email protected]');
INSERT INTO users_by_city(city, firstname, lastname, email)
VALUES('PARIS', 'Yellow', 'Jackets', '[email protected]');
INSERT INTO users_by_city(city, firstname, lastname, email)
VALUES('ORLANDO', 'David', 'Gilardi', '[email protected]');
- List values with partitions keys
SELECT * from users_by_city WHERE city='PARIS';
- List values without parititions keys
SELECT * from users_by_city WHERE lastname='Gilardi';
Yes we know
and we will explain why.
- Same query with debugging enabled
tracing on;
SELECT * from users_by_city WHERE city='PARIS';
- Forcing Full Scan = SLOW
SELECT * from users_by_city WHERE lastname='Gilardi' ALLOW FILTERING;
- Stop debugging
tracing off;
Let's do some hands-on with document database queries.
Document databases expand on the basic idea of key-value stores where βdocumentsβ are more complex, in that they contain data and each document is assigned a unique key, which is used to retrieve the document. These are designed for storing, retrieving, and managing document-oriented information, often stored as JSON. Since the Document database can inspect the document contents, the database can perform some additional retrieval processing. Unlike RDBMSs which require a static schema, Document databases have a flexible schema as defined by the document contents. Examples include: MongoDB and CouchDB. Note that some RDBMS and NoSQL databases outside of pure document stores are able to store and query JSON documents, including Cassandra.
β 3a. Cassandra knows JSON :
It is not a known fact but Cassandra accepts JSON query out of the box. You can find more information here
- Insert data into Cassandra with JSON
INSERT INTO videos JSON '{
"videoid":"e466f561-4ea4-4eb7-8dcc-126e0fbfd578",
"email":"[email protected]",
"title":"A JSON videos",
"upload":"2020-02-26 15:09:22 +00:00",
"url": "http://google.fr",
"frames": [1,2,3,4],
"tags": [ "cassandra","accelerate", "2020"]
}';
In the same way you can retrieve JSON out of Cassandra (more info here)
- Retrieve data from Cassandra as JSON
select json title,url,tags from videos;
β 3b. Create your Application token :
Use this documentation to create your application token. Copy the token value in a text file somewhere we will reuse it a lot
Astra Token format (do not copy)
AstraCS:KDfdKeNREyWQvDpDrBqwBsUB:ec80667c....
This walkthrough has been realized using the Quick Start
Next go the connect page, locate the SWAGGER URL
Locate the Document part in the Swagger UI.
β 3c Create a new empty collection :
- Access Create a new empty collection in a namespace
- Click
Try it out
button - Fill Header
X-Cassandra-Token
with<your_token>
- For
namespace-id
usenosql1
- For
body
use
{ "name": "col1" }
- Click
Execute
button
You will get a 201 returned code
β 3d. Create a new document :
- Access Create a new document
- Click
Try it out
button - Fill with Header
X-Cassandra-Token
with<your_token>
- For
namespace-id
usenosql1
- For
collection-id
usecol1
- For
body
use
{
"videoid":"e466f561-4ea4-4eb7-8dcc-126e0fbfd573",
"email":"[email protected]",
"title":"A Second videos",
"upload":"2020-02-26 15:09:22 +00:00",
"url": "http://google.fr",
"frames": [1,2,3,4],
"tags": [ "cassandra","accelerate", "2020"],
"formats": {
"mp4": {"width":1,"height":1},
"ogg": {"width":1,"height":1}
}
}
- Click
Execute
button
ποΈ Expected output
{
"documentId":"5d746e40-97cf-490b-ab0d-68cfbc5d2ef3"
}
You can add a couple of documents changing values, new documents with new ids will be generated
β 3e Find all documents of a collection :
- Access Search documents in a collection
- Click
Try it out
button - Fill Header
X-Cassandra-Token
with<your_token>
- For
namespace-id
usenosql1
- For
collection-id
usecol1
Let other fields blank every query is paged in Cassandra.
ποΈ Expected output:
{
"data": {
"84bd6ebc-a274-4dc3-ae7c-eb2fd913331b": {
"email": "[email protected]",
"formats": {
"mp4": {
"height": 1,
"width": 1
},
"ogg": {
"height": 1,
"width": 1
}
},
"frames": [
1,
2,
3,
4
...
- Click
Execute
button
β 3f Retrieve a document from its id :
- Access Get a document
- Click
Try it out
button - Fill Header
X-Cassandra-Token
with<your_token>
- For
namespace-id
usenosql1
- For
collection-id
usecol1
- For
document-id
use<doc_id_in_previous_call>
- Click
Execute
button
ποΈ Expected output:
{
"documentId": "84bd6ebc-a274-4dc3-ae7c-eb2fd913331b",
"data": {
"email": "[email protected]",
"formats": {
"mp4": {
"height": 1,
"width": 1
},
"ogg": {
"height": 1,
"width": 1
}
},
"frames": [
1,
2,
3,
4
],
"tags": [
"cassandra",
"accelerate",
"2020"
],
"title": "A Second videos",
"upload": "2020-02-26 15:09:22 +00:00",
"url": "http://google.fr",
"videoid": "e466f561-4ea4-4eb7-8dcc-126e0fbfd573"
}
}
β 3g Search document from a where clause :
- Access Search documents in a collection
- Click
Try it out
button - Fill Header
X-Cassandra-Token
with<your_token>
- For
namespace-id
usenosql1
- For
collection-id
usecol1
- For
where
use{"email": { "$eq":"[email protected]" } }
- Click
Execute
button
ποΈ Expected output
{
"data": {
"84bd6ebc-a274-4dc3-ae7c-eb2fd913331b": {
"email": "[email protected]",
"formats": {
"mp4": {
"height": 1,
"width": 1
},
"ogg": {
"height": 1,
"width": 1
}
},
"frames": [
1,
2,
3,
4
],
"tags": [
"cassandra",
"accelerate",
"2020"
],
"title": "A Second videos",
"upload": "2020-02-26 15:09:22 +00:00",
"url": "http://google.fr",
"videoid": "e466f561-4ea4-4eb7-8dcc-126e0fbfd573"
}
}
}
Key/Value databases are some of the least complex as all of the data within consists of an indexed key and a value. Key-value databases use a hashing mechanism such that given a key, the database can quickly retrieve an associated value. Hashing mechanisms provide constant time access, which means they maintain high performance even at large scale. The keys can be any type of object, but are typically a string. The values are generally opaque blobs (i.e., a sequence of bytes that the database does not interpret). Examples include: Redis, Amazon DynamoDB, Riak, and Oracle NoSQL database. Some tabular NoSQL databases, like Cassandra, can also service key/value needs.
β 4b. Create a table with GraphQL
Use this query
mutation {
kv: createTable(
keyspaceName:"nosql1",
tableName:"key_value",
partitionKeys: [ # The keys required to access your data
{ name: "key", type: {basic: TEXT} }
]
values: [ # The values associated with the keys
{ name: "value", type: {basic: TEXT} }
]
)
}
You can check in the CQL COnsole as well;
use nosql1;
describe table key_value;
Expected output
β 4c. Populate the table
Any of the created APIs can be used to interact with the GraphQL data, to write or read data.
Navigate to the GraphQL playground, by clicking Connect
, then GraphQL API
, and finally you should see the URL under Launching GraphQL Playground
. Ctrl-click or right-click the link to launch into a new tab.
First, letβs navigate to your new keyspace nosql1
inside the playground. Change tab to graphql and pick url /graphql/nosql1
- Fill the header token again
{
"x-cassandra-token":"AstraCS:fjlsgehrre;ge"
}
- Execute this query
mutation insert2KV {
key1: insertkey_value(value: {key:"key1", value:"bbbb"}) {
value {
key,value
}
}
key2: insertkey_value(value: {key:"key2", value:"bbbb"}) {
value {
key,value
}
}
}
- Check with CQL CConsole
select * from key_value;
Expected output:
- Execute this query
mutation insert2KV {
key1: insertkey_value(value: {key:"key1", value:"cccc"}) {
value {
key,value
}
}
}
- Check with CQL Console the values should have been updated. Indeed with Cassandra no integrite constraints so you can de-duplicate values easily,
Graph databases store their data using a graph metaphor to exploit the relationships between data. Nodes in the graph represent data items, and edges represent the relationships between the data items. Graph databases are designed for highly complex and connected data, which outpaces the relationship and JOIN capabilities of an RDBMS. Graph databases are often exceptionally good at finding commonalities and anomalies among large data sets. Examples of Graph databases include DataStax Graph, Neo4J, JanusGraph, and Amazon Neptune.
Astra does not contain yet a way to implement Graph Databases use cases. But at Datastax Companny we do have a product called DataStax Graph that you can use for free when not in production.
Today it will be a demo to be quick but you can as well do and start the demo with the following steps
β 5a. Prerequisites
Minimal Configuration: You need to have a computer with this minimal configuration requirements
- At least 2CPU
- At least 3GB or RAM
Install Docker and Docker Compose
You tneed to install Docker and Docker-compose on your machine
β 5b. Create a docker network named 'graph'
docker network create graph
π₯οΈ Expected output
$workshop_introduction_to_nosql> docker network create graph
64f8bcc2dda416d6dc80ef3c1ac97902b9d90007842808308e9d741d179d9344
β 5c.Clone this repository (or download ZIP from the github UI)
git clone https://github.com/datastaxdevs/workshop-introduction-to-nosql.git
cd workshop-introduction-to-nosql
β 5d.Start the containers
docker-compose up -d
π₯οΈ Expected output
$workshop_introduction_to_nosql> docker-compose up -d
Creating dse ... done
Creating workshop-introduction-to-nosql_studio_1 ... done
Wait for the application to start (30s) and open http://localhost:9091
β 5e.Check database connection
Open the ellipsis and click Connections
Select the default localhost
connection
Check that dse
is set for the host (pointing to a local cassandra)
Click the button Test
and expect the output Connected Successfully
β 5f. Open the notebook Work
Use the ellipsis to now select Notebooks
Once the notebook open it has you to creat the graph click Create
button
Execute cell after call spotting the Real Time >
button in each cell (top right)
Voila !
β 5g. Close Notebook
To close open notebooks you can now use
docker-compose down
π₯οΈ Expected output
$workshop_introduction_to_nosql> docker-compose down
Stopping workshop-introduction-to-nosql_studio_1 ... done
Stopping dse ...
Congratulation your made it to the END.