Introduction to Neo4j graph database and Gephi tool.
What is a Graph Database?
A graph database is a database that uses graph structures for semantic queries with nodes, edges, and properties to represent and store data. A key concept of the system is the graph. The graph relates the data items in the store to a collection of nodes and edges, the edges representing the relationships between the nodes. The relationships allow data in the store to be linked together directly and, in many cases, retrieved with one operation. Graph databases hold the relationships between data as a priority.
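To make the idea of nodes, edges, and properties concrete, here is a minimal in-memory sketch in Python. The names `nodes`, `edges`, and `neighbors` are made up for illustration, and the sample data is hypothetical; this is not how Neo4j or any real graph database stores data internally.

```python
# Nodes: id -> properties (the label is kept as a plain property for simplicity).
nodes = {
    1: {"label": "Person", "name": "Palak", "from": "India"},
    2: {"label": "Person", "name": "Tom Hardy", "from": "England"},
}

# Edges: (source id, relationship type, target id, edge properties).
edges = [
    (1, "KNOWS", 2, {"since": 1999}),
]


def neighbors(node_id, rel_type):
    """Return the properties of nodes reached by outgoing edges of rel_type."""
    return [
        nodes[dst]
        for src, rel, dst, _props in edges
        if src == node_id and rel == rel_type
    ]


known = neighbors(1, "KNOWS")
print([person["name"] for person in known])  # ['Tom Hardy']
```

Because each edge names its two endpoints directly, finding related data is a lookup along the edge rather than a join across tables.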
What is Neo4j?
Neo4j is an open-source, NoSQL, native graph database that provides an ACID-compliant transactional backend for your applications. Neo4j is referred to as a native graph database because it efficiently implements the property graph model down to the storage level. This means that the data is stored exactly as you whiteboard it, and the database uses pointers to navigate and traverse the graph. In contrast to graph processing or in-memory libraries, Neo4j also provides full database characteristics, including ACID transaction compliance, cluster support, and runtime failover - making it suitable to use graphs for data in production scenarios.
Let's get started with Neo4j database!!
You can download the Neo4j build that matches your system requirements from here. For this demo, however, I am going to use the online Neo4j Sandbox.
1. First, sign up for the Neo4j Sandbox.
2. After you log in, you will be asked to create a new project or launch an existing one. Here I am going to use the pre-built "Graph Data Science" project.
3. Click open with browser to open Neo4j Browser.
Your browser screen looks like this
4. Click the database icon in the top-left corner to explore your graph database.
5. You can write your Cypher query in the query editor at the top.
6. Let's use Cypher to generate a small social graph.
a. Creating a new node.
CREATE (ee:Person { name: "Palak", from: "India"})
The CREATE clause is used to create data by specifying named nodes and relationships with inline properties.
- CREATE clause to create data
- () parentheses to indicate a node
- ee:Person a variable 'ee' and label 'Person' for the new node
- {} curly braces to add properties to the node
b. Finding the node created.
MATCH (ee:Person) WHERE ee.name = "Palak" RETURN ee
- MATCH clause to specify a pattern of nodes and relationships
- (ee:Person) a single node pattern with label 'Person' which will assign matches to the variable 'ee'
- WHERE clause to constrain the results
- ee.name = "Palak" compares name property to the value "Palak"
- RETURN clause used to request particular results
c. Creating another node and adding a relationship.
MATCH (pb:Person) WHERE pb.name = "Palak"
CREATE (th:Person { name: "Tom Hardy", from: "England"}),
(pb)-[:KNOWS {since: 1999}]->(th);
d. View Relationships.
MATCH (pb:Person)-[:KNOWS]-(since) WHERE pb.name = "Palak" RETURN pb, since
- MATCH clause to describe the pattern from known Nodes to found Nodes
- (pb) starts the pattern with a Person (qualified by WHERE)
- -[:KNOWS]- matches "KNOWS" relationships (in either direction)
- (since) will be bound to the nodes connected to Palak by a KNOWS relationship
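As a rough analogy for what an undirected pattern does, the Python sketch below matches KNOWS relationships in either direction over a hand-built list. The `relationships` list and the `match_knows` helper are hypothetical and only mimic the matching rule; Neo4j's actual query execution works differently.

```python
# (source name, relationship type, target name): a hand-built stand-in for the graph.
relationships = [
    ("Palak", "KNOWS", "Tom Hardy"),
]


def match_knows(name):
    """Return names linked to `name` by KNOWS, ignoring edge direction."""
    found = []
    for src, rel, dst in relationships:
        if rel != "KNOWS":
            continue
        if src == name:
            found.append(dst)  # outgoing: (name)-[:KNOWS]->(dst)
        elif dst == name:
            found.append(src)  # incoming: (src)-[:KNOWS]->(name)
    return found


print(match_knows("Palak"))      # ['Tom Hardy']
print(match_knows("Tom Hardy"))  # ['Palak']
```

Both calls find the other person even though the relationship was created in only one direction, which is the effect of leaving the arrow off the pattern.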
7. You can explore more about Cypher queries from here.
Let's get started with Gephi!!
What is Gephi?
Gephi is an open-source network analysis and visualization software package written in Java on the NetBeans platform. It is used to visualize and explore graphs and networks: think Photoshop, but for graph data. Gephi is free to download and runs on Windows, Mac OS X, and Linux.
1. Install Gephi from here.
2. Start Gephi and select a sample project for demo purposes.
3. Now you can see the graph window.
If you cannot see the graph window, you can enable it from Window->Graphs in the menu bar.
4. Using the direct selection tool, you can highlight all the nodes directly connected to the selected node.
5. Loading graph data in Gephi.
Gephi supports the CSV file format for importing data. You can import nodes using a node table, edges using an edge table, and both nodes and edges using an adjacency list or an adjacency matrix.
For example:
The sample below shows a node table of three nodes. The column of node identifiers must be named “Id”.
Id
A
B
C
The sample below shows an edge table of two edges. The columns must be named “Source” and “Target”.
Source,Target
A,B
C,A
The sample below shows an adjacency list of 6 edges and 4 nodes.
q,w,e,r
w,q,e,r
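To see where the 6 edges and 4 nodes come from, the adjacency list sample above can be expanded by hand or with a few lines of Python. This sketch assumes each row lists a source node followed by its targets, which is my reading of the format; check it against the Gephi documentation before relying on it.

```python
import csv
import io

# The adjacency list sample from above, as CSV text.
adjacency_csv = "q,w,e,r\nw,q,e,r\n"

edges = []
for row in csv.reader(io.StringIO(adjacency_csv)):
    source, *targets = row  # assumption: first cell is the source node
    edges.extend((source, target) for target in targets)

nodes = sorted({name for edge in edges for name in edge})

print(len(edges), "edges:", edges)
print(len(nodes), "nodes:", nodes)
```

Each of the two rows contributes three edges, and the distinct names across all edges are q, w, e, and r.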
The sample below shows an adjacency matrix of 2 edges and 2 nodes.
,Z,X
Z,0,1
X,1,0
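The adjacency matrix sample above can be read the same way. In this Python sketch the header row is taken as the target node names, the first column as the source node names, and any non-zero cell as an edge; that reading is an assumption on my part rather than something taken from Gephi's importer.

```python
import csv
import io

# The adjacency matrix sample from above, as CSV text.
matrix_csv = ",Z,X\nZ,0,1\nX,1,0\n"

rows = list(csv.reader(io.StringIO(matrix_csv)))
columns = rows[0][1:]  # header row: target node names

edges = []
for row in rows[1:]:
    source, cells = row[0], row[1:]
    for target, cell in zip(columns, cells):
        if float(cell) != 0:  # non-zero cell: edge from source to target
            edges.append((source, target))

print(edges)  # [('Z', 'X'), ('X', 'Z')]
```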
Now let's create the CSV files described above and import them into Gephi.
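If you would rather not type the files by hand, a short Python script can generate them. This is only a sketch: the file names and the `write_samples` helper are my own choices for illustration, not something Gephi requires.

```python
import csv
from pathlib import Path

# Sample tables from this post; the file names are arbitrary.
SAMPLES = {
    "nodes.csv": [["Id"], ["A"], ["B"], ["C"]],
    "edges.csv": [["Source", "Target"], ["A", "B"], ["C", "A"]],
    "adjacency_list.csv": [["q", "w", "e", "r"], ["w", "q", "e", "r"]],
    "adjacency_matrix.csv": [["", "Z", "X"], ["Z", "0", "1"], ["X", "1", "0"]],
}


def write_samples(directory="."):
    """Write each sample table to its own CSV file and return the paths."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for filename, rows in SAMPLES.items():
        path = out_dir / filename
        # newline="" lets the csv module control line endings itself.
        with path.open("w", newline="", encoding="utf-8") as handle:
            csv.writer(handle).writerows(rows)
        paths.append(path)
    return paths


if __name__ == "__main__":
    for path in write_samples("gephi_samples"):
        print("wrote", path)
```

Running the script writes four files into a `gephi_samples` folder, which you can then import into Gephi one at a time.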
Importing nodes
Importing edges
Importing adjacency list
Importing adjacency matrix
Final graph
Final edge table
These are the various ways to store and load a graph dataset in Gephi.
Please like, share, and comment.
References:
https://gephi.org/users/supported-graph-formats/spreadsheet/
https://www.youtube.com/watch?v=2FqM4gKeNO4