Archive for December, 2015

Installing mxnet

December 23, 2015

I wanted to install the newly released deep learning package “mxnet” on my Mac. Here’s the instruction site:

The installation mostly went fine, but I ran into a few problems, including some linking errors.

The first was with ‘libtbb.dylib’: the build kept complaining that it couldn’t find the library, even though it was in the right folder, `/usr/local/lib` (which is actually a soft link to “/usr/local/Cellar/tbb/4.4-20150728/lib/”). The real cause was a wrong setting in opencv.pc. The `-l` linker flag expects the library name without the `lib` prefix and the file extension, so I opened “/usr/local/lib/pkgconfig/opencv.pc” (which provides the meta-information for pkg-config) and changed -llibtbb.dylib to -ltbb.
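The edit can also be done from the command line. This is only a sketch: `fix_tbb_flag` is a made-up helper name, not something mxnet or OpenCV ships, and it assumes the bad flag appears in the file exactly as `-llibtbb.dylib`.

```shell
# Hypothetical helper (not part of mxnet or OpenCV): rewrite the bad
# linker flag in a pkg-config file, keeping a backup of the original.
fix_tbb_flag() {
    # $1 is the .pc file to repair; the backup is written to $1.bak
    cp "$1" "$1.bak" &&
    sed 's/-llibtbb\.dylib/-ltbb/g' "$1.bak" > "$1"
}

# On the setup described above, the call would be:
#   fix_tbb_flag /usr/local/lib/pkgconfig/opencv.pc
```

The backup copy makes it easy to undo the change if the build still fails for some other reason.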

I also got a few other linking errors, for libJPEG.dylib, libtiff.dylib and libpng.dylib. What I found is that the links point to libraries like “/usr/local/Cellar/jpeg/8d/lib/libjpeg.dylib” or “/usr/local/Cellar/libtiff/4.0.6/lib/libtiff.dylib”, but these do not seem to be the ones expected.


To fix this:

# create the locate database if it does not exist; this may take a while, so be patient
sudo launchctl load -w /System/Library/LaunchDaemons/

# use locate to find the actual library, for example
locate libJPEG.dylib

# suppose the command above returned the path abspath_to_lib; if the library already exists in /usr/local/lib, remove it first
ln -s abspath_to_lib /usr/local/lib/libJPEG.dylib
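If you have to repeat this for several libraries, the remove-then-link step can be wrapped in a small function. This is a sketch under my own assumptions: `relink_lib` is a made-up name, and the link location in the example call assumes the linker looks in /usr/local/lib.

```shell
# Hypothetical helper: point a symlink at the library that locate found.
relink_lib() {
    # $1: absolute path of the real library (abspath_to_lib above)
    # $2: path of the link to create, e.g. /usr/local/lib/libJPEG.dylib
    if [ ! -e "$1" ]; then
        echo "relink_lib: $1 does not exist" >&2
        return 1
    fi
    rm -f "$2"        # drop a stale link or file first
    ln -s "$1" "$2"
}

# Example call (abspath_to_lib is whatever locate printed):
#   relink_lib abspath_to_lib /usr/local/lib/libJPEG.dylib
```

Checking that the target exists first avoids replacing a working link with a dangling one.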

Now you can run one of the MNIST examples with `python example/image-classification/`. It should display the following results:

Screen Shot 2015-12-23 at 11.20.01 AM.png



Using spark-shell

December 23, 2015

As someone new to Spark and Scala, I find spark-shell very useful for debugging. Sometimes it feels just like the IPython shell. Here are a few tricks for using it:

0. Running `./spark-shell -h` prints a lot of help information.

1. Load an external file in spark-shell:
spark-shell -i file.scala
or, inside the shell:
scala> :load your_path_to.scala

2. Remember that when you start the shell, the SparkContext (sc) and the SQLContext (sqlContext) have already been created. If you are not in the Spark shell, remember to create them in your program.

3. You can import multiple things like this: scala> import org.apache.spark.{SparkContext, SparkConf}

4. You can use `spark-shell --jars your.jar` to start the shell with a single-jar Spark module loaded, and then you will be able to `import something_from_your_jar` from the library you just added.

5. If you install Spark locally, you can open its web UI (port 4040) for validation purposes: http://localhost:4040/environment/


6. To reuse what you have entered into the spark-shell, you can extract your input from the shell history, which is stored in a file called “.spark_history” in the user’s home directory. For example: `tail -n 5 ~/.spark_history > mySession1.scala`. Next time, you can use (1) to reload your saved Scala session. Within a shell session, you can check the history with `scala> :history`.

7. A library called scalaplot can help with visual investigation.

8. Use `SPARK_PRINT_LAUNCH_COMMAND=1 ./bin/spark-shell` to print the launch command of the Spark scripts.

9. In spark-shell, `:paste -raw` lets you enter any valid Scala code, even code that includes a package declaration.
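The history trick from tip 6 is easy to wrap in a function. This is just a sketch: `save_spark_session` is a made-up name, and the default history location (`~/.spark_history`) is an assumption you may need to adjust for your setup.

```shell
# Hypothetical helper: copy the last N lines of the shell history
# into a Scala file that can be reloaded later.
save_spark_session() {
    # $1: number of history lines to keep
    # $2: output file, e.g. mySession1.scala
    # $3: history file (optional; assumed default is ~/.spark_history)
    tail -n "$1" "${3:-$HOME/.spark_history}" > "$2"
}

# Example:
#   save_spark_session 5 mySession1.scala
#   spark-shell -i mySession1.scala
```

The saved file is raw history, so it is worth opening it and deleting lines that failed before reloading it.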

P.S. To install Spark on your Mac, you can simply use Homebrew:
$ brew update
$ brew install scala
$ brew install sbt
$ echo 'SBT_OPTS="-XX:+CMSClassUnloadingEnabled -XX:PermSize=256M -XX:MaxPermSize=512M -Xmx2G"' >> ~/.sbtconfig
$ brew install apache-spark

After the installation, you can update your PATH variable to include the path to spark/bin.
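One way to do the PATH update without adding the same directory twice is sketched below. `add_to_path` is a made-up helper, and `$SPARK_HOME` in the example call is an assumption: set it to wherever your Spark installation actually lives.

```shell
# Hypothetical helper: prepend a directory to PATH unless it is already there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;    # otherwise put it in front
    esac
}

# Example; SPARK_HOME is an assumption, adjust it to your install:
#   add_to_path "$SPARK_HOME/bin"
```

Putting the function and the call in your shell startup file makes the change stick across sessions.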

You can also set up pyspark locally; here are some instructions:

One short but nice Scala book

Categories: Scala, Spark