I got the following error message:
Error: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Long
The original Pig script was:
score = FOREACH score_account_kpi_avro
GENERATE FLATTEN(STRSPLIT(uid,':')) as (account_id:chararray,
date_sk:chararray, index:long), (double)predictedLabel, (double)predictedProb; -- (xxxxxxxx,2016-09-30,221905,221905.0,221905.6822910905)
Up to this stage everything looks fine if you dump a few records. But as soon as you join this relation with other data or compute something on it, you get “java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Long”, and it can be hard to tell why.
What is happening is that you can’t turn one of the split entries into a long just by declaring it that way in the schema (index:long); as the error message shows, the value is still a String at run time. The right way is to declare it as chararray and cast it in a downstream step, for example:
score = FOREACH score GENERATE account_id AS account_id, (double)index as index,
(double)predictedLabel as predictedLabel, (double)predictedProb as predictedProb;
I wanted to install the newly released deep learning package “mxnet” on my Mac. Here are the build instructions: http://mxnet.readthedocs.org/en/latest/build.html#building-on-osx
It mostly went fine, but I did hit a few problems, including some linking errors.
The first was with ‘libtbb.dylib’: the build kept complaining that it couldn’t find the lib, but when I checked, it was in the right folder, `/usr/local/lib`, which is actually a soft link to “/usr/local/Cellar/tbb/4.4-20150728/lib/”. The real cause was a wrong entry in opencv.pc. So I opened “/usr/local/lib/pkgconfig/opencv.pc” (which provides the meta-information for pkg-config) and changed -llibtbb.dylib to -ltbb.
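The same edit can be scripted instead of done by hand. This is only a minimal sketch: the function name `fix_tbb_flag` is made up for illustration, and it assumes the bad flag appears literally as `-llibtbb.dylib` in the file.

```shell
# Replace the malformed TBB linker flag in a pkg-config file, keeping a backup.
# fix_tbb_flag is a hypothetical helper name, not part of any tool.
fix_tbb_flag() {
    pc_file="$1"
    # -i.bak edits in place and keeps the original as <file>.bak;
    # this spelling is accepted by both BSD (macOS) and GNU sed.
    sed -i.bak 's/-llibtbb\.dylib/-ltbb/g' "$pc_file"
}

# Example (path taken from the text; may need sudo depending on file ownership):
# fix_tbb_flag /usr/local/lib/pkgconfig/opencv.pc
# pkg-config --libs opencv   # the output should now contain -ltbb
```

Keeping the `.bak` copy makes it easy to restore the original file if the change turns out not to be what your setup needs.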
I also got a few other linking errors for libJPEG.dylib, libtiff.dylib and libpng.dylib. What I found is that they point to libs like “/usr/local/Cellar/jpeg/8d/lib/libjpeg.dylib” or “/usr/local/Cellar/libtiff/4.0.6/lib/libtiff.dylib”, but these do not seem to be the ones expected.
To fix this:
# create the locate database if it does not exist; this may take a while, so be patient
sudo launchctl load -w /System/Library/LaunchDaemons/com.apple.locate.plist
# run locate on the library name to find the actual lib
# suppose the path you got from the above command is abspath_to_lib; if the lib already exists in /usr/local/lib, remove it first.
ln -s abspath_to_lib /usr/local/lib/libJPEG.dylib
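The remove-then-link step can be wrapped in a small function so it is easy to repeat for libtiff and libpng. This is a sketch under assumptions: `relink_lib` is a made-up name, the link is assumed to live in /usr/local/lib as described above, and the target path is whatever `locate` reports on your machine.

```shell
# Point link_path at target, replacing any existing file or link at link_path.
# relink_lib is a hypothetical helper name used only for this sketch.
relink_lib() {
    target="$1"      # absolute path of the real library, e.g. as found by locate
    link_path="$2"   # where the build expects it, e.g. /usr/local/lib/libJPEG.dylib
    if [ ! -e "$target" ]; then
        echo "no such library: $target" >&2
        return 1
    fi
    # -s makes a symbolic link, -f replaces an existing destination,
    # -n treats an existing symlink at the destination as a plain file.
    ln -sfn "$target" "$link_path"
}

# Example (abspath_to_lib stands for the path printed by locate):
# locate libJPEG.dylib
# relink_lib abspath_to_lib /usr/local/lib/libJPEG.dylib
```

The existence check avoids leaving a dangling link behind when the path was mistyped.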
Now you can run an MNIST example with `python example/image-classification/train_mnist.py`. It should display the following results:
I was listening to Hinton’s interview on CBC Radio (http://nvda.ly/OioP3). He mentioned several times a possible breakthrough in natural language understanding using deep learning. Human reasoning is certainly a difficult thing to model: it is too complex to be abstracted easily. As I watch my little boy grow, I am amazed every time he shows a new ability, whether to do something or to understand or perceive something. Training my own models (on images rather than language), I have started to understand them better. Structure determines function. In most cases, training is a process of “trial and error”: the model is a big black box with complex structures and connections. One of the biggest advantages of such a learning network is its ability to learn representations automatically, that is, to abstract things. With abstraction in our logical system, we are able to organize things, dissect things, compose things, and possibly create new things. Given what networks can already see and imagine (http://goo.gl/A1sL8N), it is likely that a few years from now a network trained on human language could help us translate languages that went extinct thousands of years ago, simply by looking at their scripts over and over. That would be wonderful, because so many ancient civilizations would start to shine again. Maybe I should call this “Forward to the Past”.
Original post: https://coderwall.com/p/ohk6cg/remote-access-to-ipython-notebooks-via-ssh
remote$ipython notebook --no-browser --port=8889
local$ssh -N -f -L localhost:8888:localhost:8889 remote_user@remote_host
To close the SSH tunnel on the local machine, look for the process and kill it manually:
local_user@local_host$ ps aux | grep localhost:8889
local_user 18418 0.0 0.0 41488 684 ? Ss 17:27 0:00 ssh -N -f -L localhost:8888:localhost:8889 remote_user@remote_host
local_user 18424 0.0 0.0 11572 932 pts/6 S+ 17:27 0:00 grep localhost:8889
local_user@local_host$ kill -15 18418
Alternatively, you can start the tunnel without the -f option. The process will then remain in the foreground and can be killed with ctrl-c.
On the remote machine, kill the IPython server with ctrl-c ctrl-c.
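The manual `ps` / `grep` / `kill` lookup on the local machine can also be scripted. A minimal sketch, assuming the tunnel was started with exactly the `-L` spec shown above; `tunnel_pids` is a made-up helper name, and it only filters `ps aux` text, it does not kill anything itself.

```shell
# Print the PIDs of ssh tunnel processes found in `ps aux` output on stdin.
# $1 is the forwarding spec that was passed to ssh -L.
# tunnel_pids is a hypothetical helper name used only for this sketch.
tunnel_pids() {
    spec="$1"
    # Column 2 of `ps aux` is the PID. Requiring both "ssh " and "-L <spec>"
    # on the line skips the grep process that merely mentions the port.
    awk -v spec="$spec" 'index($0, "ssh ") && index($0, "-L " spec) { print $2 }'
}

# Example: send SIGTERM to every matching tunnel
# for pid in $(ps aux | tunnel_pids localhost:8888:localhost:8889); do
#     kill -15 "$pid"
# done
```

Matching on the full `-L` spec rather than just the port keeps the filter from picking up unrelated processes that happen to mention 8889.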
Note: If you are running GPU & Theano on your remote machine, you can launch the notebook with:
THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 ipython notebook --no-browser --port=8889
Another simple way is to do the following (adding ip=*):
# On the remote server
$ THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 ipython notebook --no-browser --ip=* --port=7777
Then you can reach the notebook at http://the-ip-address-of-your-remote-server:7777/
The current benchmarks on visual recognition tasks:
The NYU Large Scale Machine Learning course looks like it will be very worthwhile to follow. The instructors, John Langford and Yann LeCun, are both key figures in the machine learning field – for instance, the former developed Vowpal Wabbit and the latter has done pioneering work in deep learning. It is not an online course like those at Coursera et al., but they have promised to put lecture videos and slides online. I’ll certainly try to follow along as best I can.
There is an interesting InnoCentive challenge going on, called “Identify Organisms from A Stream of DNA Sequences.” This is interesting to me both because of the subject matter (classification based on DNA sequences) and also because the winner is explicitly required to submit an efficient, scalable solution (not just a good classifier). Also, the prize sum is one million US dollars! It’s…