Jayati Tiwari

Channel: Jayati Tiwari

Feature comparison of Machine Learning Libraries

April 21, 2015, 10:11 pm

Machine learning is a subfield of computer science stemming from research into artificial intelligence. It is a scientific discipline that explores the construction and study of algorithms that can...

View Article

Common Functions in R

April 27, 2015, 3:57 am

Google "What's R", and you'll see there are many ways in which R has been defined. As per my understanding, firstly it's a programming language. Secondly, it's solely meant for statistical computing....

View Article

Installing R on Linux

June 15, 2015, 10:41 am

I was a bit skeptical about writing this post due to the scarcity of content, but anyhow you know what I chose. So, installing R on Ubuntu is all about the following two steps: sudo apt-get install...

View Article

Socket Programming in Java

June 15, 2015, 10:52 am

A socket literally means an electrical device that receives a plug or light bulb to make a connection. In computer programming, it means a method of communication between two programs, one acting as the...

View Article

Installing H2O and Running ML Implementations of H2O

June 15, 2015, 10:52 am

H2O is an open source predictive analytics platform. Unlike traditional analytics tools, H2O provides a combination of extraordinary math and high performance parallel processing with unrivaled ease of...

View Article

Running Naive Bayes Classification algorithm using Weka

June 15, 2015, 10:53 am

Wiki says, "Naive Bayes is a simple technique for constructing classifiers: models that assign class labels to problem instances, represented as vectors of feature values, where the class labels are...

View Article

Installing sparkling-water and Running sparkling-water's Deep Learning

June 15, 2015, 10:54 am

Sparkling Water is designed to be executed as a regular Spark application. It provides a way to initialize H2O services on each node in the Spark cluster and access data stored in data structures of...

View Article

Installing SparkMLlib on Linux and Running SparkMLlib implementations

June 15, 2015, 10:55 am

SparkMLlib is a machine learning library which ships with Apache Spark and can run on any Hadoop2/YARN cluster without any pre-installation. It is Spark’s scalable machine learning library consisting...

View Article

Install and run Augustus on CentOS

June 15, 2015, 11:33 am

Hello folks. If you are visiting this blog, you probably know what Augustus is all about, but just in case, here’s a short introduction taken directly from its makers: “Augustus is an...

View Article

Installation Script for Apache Zookeeper-3.3.5 on Linux

June 26, 2015, 9:58 am

One of my previous posts describes how to set up a Zookeeper cluster manually. Here's a quick fix: an installation script for the same. You need to run the following script (after storing the content in...

View Article

Start-up script for an installed Apache Zookeeper Cluster

June 26, 2015, 10:04 am

If you have an installed Zookeeper-3.3.5 cluster, this script will save you from manually visiting each node and starting the zkServer there. All you have to do is grab a remote machine and run the...

View Article

Installation Script for Apache Storm on Ubuntu

June 26, 2015, 11:53 am

One of my posts here describes the steps for manual installation of a Storm cluster. To make things more convenient for you, here's an installation script that you can use for setting up a Storm...

View Article

Installation Script for Apache Storm on CentOS

June 26, 2015, 12:04 pm

CentOS and Ubuntu are two well-known, widely used Linux distributions. My last post shares an installation script for a Storm cluster on Ubuntu machines, and this one is for CentOS. The few usage rules...

View Article

Start-up script for an installed Apache Storm Cluster

June 26, 2015, 12:08 pm

If you have installed a Storm cluster using my shell scripts in the previous blogs or even otherwise, this script will save you from manually visiting each node and starting the appropriate...

View Article

Installing Hadoop-1.x.x in Pseudo-Distributed Mode

July 16, 2015, 11:10 am

Disclaimer: The installation steps shared in this blog post are typically for the hadoop-1.x.x series. If you are looking for hadoop-2.x.x series installation steps i.e. with YARN, this post isn’t the...

View Article

Understanding Data Pre-processing in Mahout – Part I

July 24, 2015, 10:11 am

The two most common commands used for pre-processing train or test data when running Mahout algorithms are: seqdirectory: Turns raw text in a directory into a Mahout sequence file. seq2sparse: Creates...

View Article

Understanding Data Pre-processing in Mahout–Part II

July 24, 2015, 10:28 am

In continuation of my previous post, which describes the first of the two commands commonly used for data pre-processing in Mahout, we shall continue with the second one, i.e. “seq2sparse”, in this...

View Article

Writing your first Storm Topology

July 25, 2015, 9:33 am

This blog contains multiple posts on Storm, its installation, shell scripts for installation of a Storm cluster and integration of Storm with HBase and RDBMS. But if you're a newbie to Storm this one's...

View Article

Query Storm Data Streams using Esper

July 25, 2015, 11:41 am

As you might already be aware and as documented on the web, “Esper is a component for complex event processing (CEP) which enables rapid development of applications that process large volumes of...

View Article

Mahout’s Naïve Bayes: Train Phase

July 25, 2015, 1:27 pm

Mahout’s Naïve Bayes Classification algorithm executes in two phases: Train Phase: Trains a model using pre-processed train data. Test Phase: Classifies documents (pre-processed) with the help of the model...

View Article

Mahout’s Naïve Bayes: Test Phase

July 25, 2015, 2:09 pm

This post is a continuation of my previous post, where Mahout's Naive Bayes "trainnb" command was explained. This one describes the internal execution steps of the "testnb" command, which is...

View Article

Spark Overview

September 4, 2015, 11:19 am

Spark is a cluster computing framework, i.e. a framework which uses multiple workstations, multiple storage devices, and redundant interconnections to form a single, abstract, highly available system....

View Article

Setting up Spark-0.7.x in Standalone Mode

September 4, 2015, 11:30 am

A Spark Cluster in Standalone Mode comprises one Master and multiple Spark Worker processes. Standalone mode can be used either on a single local machine or on a cluster. This mode does not require...

View Article

Setting up a Mesos-0.9.0 Cluster

September 4, 2015, 12:03 pm

Apart from running in Standalone mode, Spark can also run on clusters managed by Apache Mesos. "Apache Mesos is a cluster manager that provides efficient resource isolation and sharing across...

View Article

Deploying the Spark-0.7.x Cluster in Standalone Mode

September 7, 2015, 12:26 pm

To deploy the Spark Cluster in Standalone Mode, run the following script, present in the Spark setup, on the cluster's Master node:
bin/start-all.sh
If everything is fine, the Spark Master UI should be...

View Article
