Sunday, January 01, 2012

New Year, New Post

another new year

Saturday, November 26, 2011

Thanksgiving

 busy with new member
ends 26 game streak

Monday, September 26, 2011

Self-Organizing Map (SOM)

SOM is a data visualization technique that reduces the dimensionality of data with a self-organizing neural network, which helps us understand high-dimensional data.

Initialize the map with random weight vectors.
for t = 0 to 1
    select a sample randomly from the set of training data
    examine every node to find the best matching unit (BMU)       --- (a)
    choose the BMU's neighbors and scale them toward the sample   --- (b)
    increase t by a small amount
end

(a): go through all the weight vectors and calculate the distance from each one to the sample. The node with the smallest distance is the best matching unit.

(b): there are different methods to choose neighbors, such as a Gaussian function or all nodes within a radius R. The new value is: current value * (1 - t) + sample vector * t.
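
The loop above can be sketched in Python. This is only a minimal sketch under stated assumptions, not a reference implementation: the function name `train_som`, the rectangular grid, the Gaussian neighborhood, and the linearly shrinking learning rate and radius are all choices made for illustration. The blending weight is called `alpha` here to keep it distinct from the loop variable t.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def train_som(samples, rows, cols, steps=200, seed=0):
    """Train a small rectangular SOM; returns a rows x cols grid of weight vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialize the map with random weight vectors.
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    radius0 = max(rows, cols) / 2.0
    for step in range(steps):
        t = step / steps                  # t runs from 0 toward 1
        sample = rng.choice(samples)      # select a training sample at random
        # (a) best matching unit: the node whose weights are closest to the sample
        bmu = min(((r, c) for r in range(rows) for c in range(cols)),
                  key=lambda rc: sq_dist(grid[rc[0]][rc[1]], sample))
        # Assumed schedule: learning rate and radius both shrink as t grows.
        rate = 1.0 - t
        radius = max(radius0 * (1.0 - t), 0.5)
        # (b) scale every node toward the sample, with a Gaussian falloff
        # based on its grid distance from the BMU
        for r in range(rows):
            for c in range(cols):
                d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                alpha = rate * math.exp(-d2 / (2.0 * radius ** 2))
                grid[r][c] = [w * (1.0 - alpha) + s * alpha
                              for w, s in zip(grid[r][c], sample)]
    return grid
```

For example, `train_som([[1, 0, 0], [0, 1, 0], [0, 0, 1]], 4, 4)` trains a 4 x 4 map on three colors. The double loop over all nodes at every step is where the computational cost comes from.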

Disadvantages: every sample needs a value for each dimension, and training is computationally expensive.

Monday, August 15, 2011

A good iPad app

Air Video

No need to convert videos and copy them to the iPad. Set up the PC/Mac as a server and watch movies on the iPad over Wi-Fi.

Search the App Store for the free version (it shows 4 movies at random).

Monday, July 25, 2011

Tools

Stanley: Connecticut; merged with Black & Decker in 2010.
Black & Decker: Maryland; merged with Stanley in 2010.
Porter-Cable, DeWalt: subsidiaries of Stanley Black & Decker.
Skil: Bosch
Ridgid: a division of Emerson Electric
Dremel: US; sold to Bosch in 1993
Ryobi, Makita: Japan
Milwaukee: WI; acquired by a HK company in 2005
Craftsman: Sears (Kmart, OSH)
Irwin: from Ohio and NC.
Husky: used to be under Stanley; sold to Home Depot
Kobalt: Lowe's own brand (launched in 1998 to compete with Husky and Craftsman)
Bessey / Jorgensen (Lowe's / Home Depot)

[update: 9/22/2011]
Snap-On: based in Wisconsin, 2.4B revenue now.
SK: Illinois.
Proto: acquired by Stanley in 1984.
Matco: stands for Mac Allied Tool Company. Ohio.
Gearwrench: belongs to Danaher Corp.
Duralast: Autozone.
Armstrong: Chicago, acquired by Danaher Corp.

[update: 10/10/2011]
Channellock: pliers in PA
Klein: in Illinois, good for electrician's tools and lineman's pliers.
Knipex: Germany, pliers.

Friday, July 08, 2011

New OpenCV book

Packt has recently published its first book on OpenCV, titled “OpenCV 2 Computer Vision Application Programming Cookbook”. Written by Robert Laganiere, the book contains examples with source code that teach you how to program computer vision applications in C++ using the different features of the OpenCV library. http://www.packtpub.com/opencv-2-computer-vision-application-programming-cookbook/book

All the code in this book uses the OpenCV C++ interface instead of the C one. It covers many image processing topics.

Check the sample chapter

Wednesday, July 06, 2011

Solve equations in MATLAB

>> syms x y d1 d2 R1 R2 mag1 mag2 mag3;
>> S = solve('x^2+y^2=R1^2', '(d2-x)^2+y^2=R1^2*mag1/mag2','x^2+(d1-y)^2=R1^2*mag1/mag3','x','y','R1');
>> S.x
>> S.y

Monday, June 13, 2011

Setup OpenCV2 in Visual Studio 2010

In the project "Property Pages":
C/C++ – General – Additional Include Directories:
C:\OpenCV2.2\include;

Linker – General – Additional Library Directories:
C:\OpenCV2.2\lib;

Linker – Input – Additional Dependencies:
opencv_core220d.lib;opencv_highgui220d.lib;
opencv_imgproc220d.lib;opencv_features2d220d.lib;
opencv_calib3d220d.lib;

In source code:
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>



Tuesday, May 24, 2011

Data mining (2)

The data classification process includes two steps: learning and classification. Because the class label of each training item is provided, this is called supervised learning. It contrasts with unsupervised learning (clustering), in which the class label of each training item is not known, and the number or set of classes to be learned may not be known in advance either.

Criteria for evaluating classification and prediction methods:
accuracy, speed, robustness (even given noisy or missing data), scalability (can be applied to large amounts of data), and interpretability.

Backpropagation (BP) is a neural network learning algorithm. The advantages of neural networks include a high tolerance to noisy data as well as the ability to classify patterns on which they have not been trained.

A multilayer neural network includes an input layer, one or more hidden layers, and an output layer. We call it a two-layer neural network if there are only these three layers (the input layer is not counted because it serves only to pass the input values to the next layer). If it contains two hidden layers, it is called a three-layer neural network.

Before training begins, we have to decide: the number of units in the input layer, the number of hidden layers, the number of units in each hidden layer, and the number of units in the output layer. Normalizing the input data speeds up the learning process.
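
As a small illustration of the normalization step, here is a sketch of min-max scaling, which rescales each input attribute to the range [0, 1]. The post does not say which normalization to use, so min-max scaling and the function name `min_max_normalize` are assumptions made for this example.

```python
def min_max_normalize(rows):
    """Rescale each column of `rows` (a list of equal-length lists) to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            # A constant column has no spread, so map it to 0.0.
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return normalized
```

With two attributes on very different scales, such as `[[30, 20000], [40, 60000], [50, 100000]]`, both columns end up on the same [0, 1] scale, so neither one dominates the weight updates.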

SVM uses a nonlinear mapping to transform the original training data into a higher dimension. Within this new dimension, it searches for the linear optimal separating hyperplane (the decision boundary that separates the items of one class from another). With an appropriate nonlinear mapping to a sufficiently high dimension, data from two classes can always be separated by a hyperplane. SVM finds this hyperplane using support vectors (some “essential” training items) and margins (defined by the support vectors). SVM searches for the hyperplane with the largest margin, that is, the maximum marginal hyperplane (MMH).

The complexity of the learned classifier is decided by the number of support vectors rather than the dimensionality of the data. Hence, SVM is less prone to overfitting than other methods. The support vectors are the essential or critical training items; they lie closest to the decision boundary (MMH). If all the other training items were removed and the training process repeated, we would get the same separating hyperplane.

A nonlinear SVM is obtained by extending the approach for linear SVM. First, transform the original input data into a higher dimensional space using a nonlinear mapping. Second, search for a linear separating hyperplane in the new space. For example, a 3D input vector X={x1,x2,x3} is mapped to a 6D space Z using the mappings \phi_1(X)=x1, \phi_2(X)=x2… \phi_4(X)=(x1)^2, \phi_5(X)=x1x2, \phi_6(X)=x1x3. A decision hyperplane in the new space is d(Z)=WZ+b. Instead of computing the dot product on the transformed data items, it turns out that it is mathematically equivalent to apply a kernel function K(Xi, Xj)=\phi(Xi).\phi(Xj) to the original input data. In other words, everywhere \phi(Xi).\phi(Xj) appears in the training algorithm, we can replace it with the kernel function K(Xi, Xj). The calculations are then made in the original input space, which has much lower dimensionality.
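
The kernel idea can be checked numerically with a small sketch. The mapping used here is not the 6D one from the example above: it is a simpler, assumed 2D-to-3D mapping, \phi(X)=(x1^2, sqrt(2)*x1*x2, x2^2), chosen because its dot product equals the degree-2 polynomial kernel K(Xi, Xj)=(Xi.Xj)^2. The names `phi`, `dot`, and `poly_kernel` are made up for this illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def phi(x):
    """Assumed explicit mapping from 2D input space to a 3D feature space."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly_kernel(x, y):
    """Degree-2 polynomial kernel, computed entirely in the input space."""
    return dot(x, y) ** 2


xi, xj = (1.0, 2.0), (3.0, 4.0)
in_feature_space = dot(phi(xi), phi(xj))   # map first, then take the dot product
in_input_space = poly_kernel(xi, xj)       # never leaves the 2D input space
print(in_feature_space, in_input_space)
```

Both values agree up to floating-point rounding, but the kernel version never builds the higher-dimensional vectors.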

Friday, May 20, 2011

Thursday, May 19, 2011

Cross-validation

In k-fold cross-validation, the initial data are randomly partitioned into k mutually exclusive subsets or 'folds', D1, D2, ..., Dk, each of approximately equal size. Training and testing are performed k times. In iteration i, partition Di is reserved as the test set, and the remaining partitions are collectively used to train the model. So each sample is used the same number of times for training (k-1 times) and once for testing. For classification, the accuracy estimate is the overall number of correct classifications from the k iterations, divided by the total number of items in the initial data.
In general, 10-fold cross-validation is recommended for estimating accuracy due to its relatively low bias and variance.
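
The procedure can be sketched in a few lines of Python. This is a minimal sketch, not a library API: the function names, the `train_fn` / `predict_fn` callback interface, and the toy majority-class classifier are all assumptions made for the example.

```python
import random
from collections import Counter


def k_fold_indices(n_items, k, seed=0):
    """Randomly partition indices 0..n_items-1 into k folds of near-equal size."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(items, labels, k, train_fn, predict_fn, seed=0):
    """Return the k-fold accuracy estimate: total correct / total items."""
    folds = k_fold_indices(len(items), k, seed)
    correct = 0
    for i, test_idx in enumerate(folds):
        # Fold i is the test set; all other folds together form the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = train_fn([items[j] for j in train_idx],
                         [labels[j] for j in train_idx])
        correct += sum(predict_fn(model, items[j]) == labels[j]
                       for j in test_idx)
    return correct / len(items)


# Toy classifier for illustration: always predicts the most common training label.
def train_majority(items, labels):
    return Counter(labels).most_common(1)[0][0]


def predict_majority(model, item):
    return model
```

For instance, `cross_validate(data, labels, 10, train_majority, predict_majority)` gives the 10-fold accuracy estimate for the toy classifier. Any other model can be plugged in through the two callbacks.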