Install TensorFlow using Anaconda

July 17, 2017

If you search online, you will find many posts on how to install TensorFlow on a Mac. Some are old and some are new, so it is hard to decide which one to follow. Let’s make the process easier so we can focus on the core task of exploring deep learning.

Please note that starting with v1.2, TensorFlow no longer supports GPU on macOS, so below we will install only the CPU version. It’s good to start with something simple. There are two ways to install it under Anaconda.

Use the ‘conda’ command, which is quite straightforward:

# Python 2.7
$ conda create -n tensorflow python=2.7

$ source activate tensorflow
(tensorflow)$  # Your prompt should change

# Use 'conda' command: Linux/Mac OS X, Python 2.7/3.4/3.5, CPU only:
(tensorflow)$ conda install -c conda-forge tensorflow


Or use the pip command:

(tensorflow)$ pip install --ignore-installed --upgrade \



You can then install Keras and IPython in the same environment:

# Note: this will downgrade TensorFlow to 1.0

conda install -c conda-forge keras=2.0.2

conda install ipython
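
To confirm what actually ended up in the environment (for instance, whether Keras downgraded TensorFlow), a small check like the following can help. It is only a sketch: the helper name `installed_version` is made up for this post, and it assumes each package exposes a `__version__` attribute, which TensorFlow and Keras do but not every package does.

```python
from __future__ import print_function

import importlib


def installed_version(package_name):
    """Return the package's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(package_name)
    except ImportError:
        return None
    # Fall back to "unknown" for packages that do not define __version__.
    return getattr(module, "__version__", "unknown")


# Run this inside the activated 'tensorflow' environment.
for name in ("tensorflow", "keras"):
    print(name, installed_version(name))
```

A result of `None` means the package is not importable from the current environment, which usually means the wrong environment is active.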


Categories: Deep learning, MacOS

Pig Error for string to long

October 19, 2016

I got the following error message:

Error: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Long

The original Pig script was:


score = FOREACH score_account_kpi_avro
GENERATE FLATTEN(STRSPLIT(uid, ':')) as (account_id:chararray,
date_sk:chararray, index:long), (double)predictedLabel, (double)predictedProb; -- (xxxxxxxx,2016-09-30,221905,221905.0,221905.6822910905)

Up to this stage, if you dump some examples, everything looks fine. But if you go on to join with other data or compute something, you’ll get the error “java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Long”, and it can be hard to tell why.

What is happening is that when you do the split, you can’t cast one of the split entries to long directly in the schema (index:long). The right way is to declare it as chararray and cast it in a downstream step, for example:

score = FOREACH score GENERATE account_id AS account_id, (double)index as index,
(double)predictedLabel as predictedLabel, (double)predictedProb as predictedProb;
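
The same idea can be shown outside Pig. The snippet below is a rough Python analogy, not how Pig works internally: splitting a string always yields strings, and the numeric conversion has to be a separate, explicit step. The helper name `split_uid` is made up for illustration.

```python
from __future__ import print_function


def split_uid(uid):
    """Split 'account:date:index' into three string fields, like STRSPLIT."""
    account_id, date_sk, index = uid.split(":")
    return account_id, date_sk, index


account_id, date_sk, index = split_uid("xxxxxxxx:2016-09-30:221905")

# The split produced a string; declaring it numeric would not convert it.
assert isinstance(index, str)

# The explicit cast happens afterwards, as in the corrected Pig script.
index_as_number = float(index)
print(index_as_number)  # 221905.0
```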

Categories: Uncategorized

Installing mxnet

December 23, 2015

I wanted to install the newly released deep learning package “mxnet” on my Mac. Here’s the instruction site:

It mostly went fine, but I did hit a few problems, including some linking errors.

One was with ‘libtbb.dylib’: the linker kept complaining that it couldn’t find the lib, but when I checked, it was in the right folder, `/usr/local/lib`, which is actually a soft link to “/usr/local/Cellar/tbb/4.4-20150728/lib/”. The real cause was a wrong setting in opencv.pc. So I opened “/usr/local/lib/pkgconfig/opencv.pc” (which provides the meta-information for pkg-config) and changed -llibtbb.dylib to -ltbb.

I also got a few other linking errors for libJPEG.dylib, libtiff.dylib and libpng.dylib. They point to libs such as “/usr/local/Cellar/jpeg/8d/lib/libjpeg.dylib” or “/usr/local/Cellar/libtiff/4.0.6/lib/libtiff.dylib”, but these do not seem to be the ones expected.


To fix this:

# Create the locate database if it does not exist yet; this may take a while, so be patient
sudo launchctl load -w /System/Library/LaunchDaemons/

# Use locate to find the actual lib, for example
locate libJPEG.dylib

# Suppose the command above returned the path abspath_to_lib. If the lib already exists in /usr/local/lib, remove it first.
ln -s abspath_to_lib /usr/local/lib/libJPEG.dylib

Now you can run one of the MNIST examples with `python example/image-classification/`. It should display the following results:

[Screenshot: output of the MNIST example run]


Using spark-shell

December 23, 2015

As a newcomer to Spark and Scala, I find spark-shell very useful for debugging. Sometimes it feels just like the IPython shell. Here are a few tricks for using it:

0. Running `./spark-shell -h` prints a lot of help information.

1. Load external file in spark-shell:
spark-shell -i file.scala  or in-shell do
scala> :load your_path_to.scala

2. Remember that when you start the shell, the SparkContext (sc) and the SQLContext (sqlContext) are already created. If you are not in the spark shell, remember to create them in your program.

3. You can import multiple things like this: scala> import org.apache.spark.{SparkContext, SparkConf}

4. You can use `spark-shell --jars your.jar` to add a jar at startup, and then you will be able to `import something_from_your_jar` from the library you just added.

5. If you install Spark locally, you can open its web UI (port 4040) for validation purposes: http://localhost:4040/environment/


6. To reuse what you have entered into the spark-shell, you can extract your input from the shell history, which is kept in a file called “.spark_history” in the user’s home directory. For example: `tail -n 5 ~/.spark_history > mySession1.scala`. Next time, you can use (1) to reload your saved Scala session. Within a shell session, you can check the history with `scala> :history`.

7. A library called scalaplot can help you to do some visual investigation.

8. Use `SPARK_PRINT_LAUNCH_COMMAND=1 ./bin/spark-shell` to print the launch command of the Spark scripts.

9. In spark-shell, `:paste -raw` lets you enter any valid Scala code, including package declarations.
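
Tip 6 can also be scripted if you save sessions often. The following is a minimal sketch that mirrors `tail -n 5 ... > mySession1.scala`; the function names `last_lines` and `save_session` are made up here, and the history file location is the one mentioned in tip 6.

```python
def last_lines(path, n):
    """Return the last n lines of a text file as a list of strings."""
    if n <= 0:
        return []
    with open(path) as handle:
        return handle.readlines()[-n:]


def save_session(history_path, output_path, n=5):
    """Copy the last n history lines into a file that :load can read later."""
    lines = last_lines(history_path, n)
    with open(output_path, "w") as out:
        out.writelines(lines)
    return len(lines)
```

For example, `save_session(os.path.expanduser("~/.spark_history"), "mySession1.scala")` (after `import os`) keeps the last five entries, which you can reload with `:load mySession1.scala`.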

P.S. To install Spark on your Mac, you can simply use Homebrew:
$ brew update
$ brew install scala
$ brew install sbt
$ echo 'SBT_OPTS="-XX:+CMSClassUnloadingEnabled -XX:PermSize=256M -XX:MaxPermSize=512M -Xmx2G"' >> ~/.sbtconfig
$ brew install apache-spark

After the installation, you can update your PATH variable to include the path to spark/bin.

You can also set up pyspark locally, here are some instructions:

One short but nice Scala book

Categories: Scala, Spark

A Few Python-Based Deep Learning Libs

June 23, 2015

Lasagne: a lightweight Theano extension; Theano can still be used explicitly.

Keras: a minimalist, highly modular neural network library in the spirit of Torch, written in Python, that uses Theano under the hood for fast tensor manipulation on GPU and CPU. It was developed with a focus on enabling fast experimentation.

Pylearn2: a wrapper for Theano, configured with YAML, experiment-oriented.

Caffe: a CNN-oriented deep learning framework written in C++, with a Python wrapper and easy model definitions using prototxt.

Theano: general GPU math.

nolearn: probably an even simpler one.

You can find more here.

Lasagne and nolearn are still under rapid development, so they change a lot. Be careful with the installed versions, because they need to match each other. If you run into problems such as “cost must be a scalar”, you can refer to the link here and solve them by uninstalling and reinstalling both:

pip uninstall Lasagne
pip uninstall nolearn
pip install -r

Forward to the past

June 19, 2015

I was listening to Hinton’s interview on CBC Radio. He mentioned several times a possible breakthrough in natural language understanding using deep learning. Human reasoning is certainly difficult to model, because it is too complex to abstract easily. As I watch my little boy grow, I am amazed every time he shows a new ability: the ability to do something, or the ability to understand or perceive something.

When training my own models (on images rather than language), I have started to understand the models better. Structure determines function. In most cases, training is more like a process of “trial and error”: the model is a big black box with complex structures and connections. One of the biggest advantages of such a learning network is its ability to learn representations automatically, that is, to abstract things. With abstraction in our logical system, we are able to organize things, dissect things, compose things, and possibly create new things.

Given what networks can already see and imagine, it seems likely that a few years from now a network trained on human language could help us translate languages that went extinct thousands of years ago, simply by looking at their scripts over and over. That would be wonderful, because so many ancient civilizations would start to shine again. Maybe I should call this “Forward to the Past”.

Categories: Uncategorized

Remote access ipython notebooks

February 18, 2015

Original post:

remote$ipython notebook --no-browser --port=8889

local$ssh -N -f -L localhost:8888:localhost:8889 remote_user@remote_host

To close the SSH tunnel on the local machine, look for the process and kill it manually:

local_user@local_host$ ps aux | grep localhost:8889
local_user 18418  0.0  0.0  41488   684 ?        Ss   17:27   0:00 ssh -N -f -L localhost:8888:localhost:8889 remote_user@remote_host
local_user 18424  0.0  0.0  11572   932 pts/6    S+   17:27   0:00 grep localhost:8889

local_user@local_host$ kill -15 18418

Alternatively, you can start the tunnel without the -f option. The process will then remain in the foreground and can be killed with ctrl-c.

On the remote machine, kill the IPython server with ctrl-c ctrl-c.
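
If you open this tunnel often, the ssh invocation above can be assembled by a small helper. This is only a sketch: the function name `tunnel_command` and its parameters are made up for illustration, and the defaults simply mirror the ports used above.

```python
def tunnel_command(remote_user, remote_host,
                   local_port=8888, remote_port=8889, background=True):
    """Build the ssh argument list that forwards local_port to remote_port."""
    args = ["ssh", "-N"]
    if background:
        # -f sends ssh to the background; omit it to stop with ctrl-c instead.
        args.append("-f")
    forward = "localhost:{0}:localhost:{1}".format(local_port, remote_port)
    target = "{0}@{1}".format(remote_user, remote_host)
    return args + ["-L", forward, target]


print(" ".join(tunnel_command("remote_user", "remote_host")))
```

The returned list can be passed to `subprocess.call` to actually open the tunnel. With `background=False` the process stays in the foreground, as described above.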

Note: If you are running GPU & Theano on your remote machine, you can launch the notebook by:

THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 ipython notebook --no-browser --port=8889

Another simple way is to do the following (adding ip=*):

# On the remote server

$ THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 ipython notebook --no-browser --ip=* --port=7777

Then you can reach the notebook at http://the-ip-address-of-your-remote-server:7777/

Categories: Uncategorized