Run your Hadoop MapReduce job on Amazon EMR | The Pragmatic Integrator:
'via Blog this'
Monday, December 30, 2013
Friday, December 27, 2013
Setting up multi-node Hadoop cluster on Mac
On Mac (OS X Lion 10.7.5)
- Download
- Launch
- Common for all the four nodes below
- The goal is to create the four nodes below to simulate an ideal multi-node cluster. The edge node will host Cloudera Manager (to install Hadoop on the cluster) and Eclipse (to develop code), and is where jobs are submitted. The name node will run the NameNode, Secondary NameNode and JobTracker services. The data nodes will run the DataNode and TaskTracker services.
- Open VMware Fusion and create four virtual machines (VMs) using the Ubuntu image that you downloaded earlier: click "Add" >> "New" >> "Install from disc or image" >> "Continue" >> "Use another disc or disc image" >> point to the downloaded Ubuntu image file >> "Customize" based on the memory/processors you have available on your Mac, then name each virtual machine according to its usage. (You will need to remember the user ID and password that you provide here.)
- Launch each of the machines
- Login to the machine
- Click on "Dash" >> search for "Terminal" >> Open Terminal
- sudo apt-get install openssh-server (to accept SSH connections)
- ssh-keygen -t rsa -P "" (then hit enter when prompted)
- cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
- chmod 600 ~/.ssh/authorized_keys (sshd may ignore the file if its permissions are too open)
- Run "ifconfig" command and note down the ip address for each of the machines
- sudo vi /etc/hostname (replace "ubuntu" with new VM name e.g. "edge" or "nn1" or "dn1" or "dn2")
- sudo hostname <VM_Name> (e.g. sudo hostname edge) to apply the new hostname without waiting for a reboot
- sudo vi /etc/hosts (Comment out the lines for "localhost" and "ubuntu" then add a line for each of the VMs "IPaddress Machine_Name")
- sudo vi /etc/sudoers (add a line at the bottom "<user_id> ALL=(ALL) NOPASSWD: ALL") to provide root privileges for the <user_id>
- Set time and timezone
- sudo apt-get install ntp
- sudo dpkg-reconfigure tzdata
- Restart the machine (Click on power button on top right >> "shutdown" >> "restart")
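The /etc/hosts step above can be sketched as follows. This is only an illustration: the four IP addresses are placeholders (assumptions, not values from this setup), and the function name `print_cluster_hosts` is made up for the example. Use the addresses that ifconfig reported on your own VMs.

```shell
#!/usr/bin/env bash
# Sketch only: the IP addresses are placeholders, not real values.
# Replace them with the addresses that ifconfig showed on each VM.

# Print one "IPaddress Machine_Name" line per node, in /etc/hosts format.
# printf reuses its format string for every pair of arguments.
print_cluster_hosts() {
  printf '%s\t%s\n' \
    "192.168.1.101" "edge" \
    "192.168.1.102" "nn1" \
    "192.168.1.103" "dn1" \
    "192.168.1.104" "dn2"
}

# Preview the entries; nothing is written to disk here.
print_cluster_hosts
```

Once the preview looks right, the same lines could be appended on each VM with `print_cluster_hosts | sudo tee -a /etc/hosts`, which avoids retyping the four entries by hand in vi.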
- Edge node
- cat ~/.ssh/id_rsa.pub
- Highlight and copy the contents from the cat command above
- Go to each of the other three nodes, vi ~/.ssh/authorized_keys
- Paste the copied contents at the end of the above file & save
- SSH to all four machines, including the node itself, a couple of times to make sure you are not prompted for anything. The first time you may need to type "yes" to accept the host key (e.g. ssh nn1, ssh dn1, ssh dn2)
- Download Cloudera Manager & run it per the instructions at the same link
- cd to download dir
- chmod +x cloudera-manager-installer.bin
- sudo ./cloudera-manager-installer.bin
- Follow the instructions
- Open a browser and go to http://localhost:7180
- Login with "admin" and "admin"
- Start the install; use the user ID that you created when you set up the VMs
- Enter the IP addresses of all four nodes (including the edge node itself)
- Continue the installation.
- Name node
- cat ~/.ssh/id_rsa.pub
- Highlight and copy the contents from the cat command above
- Go to each of the other three nodes, vi ~/.ssh/authorized_keys
- Paste the copied contents at the end of the above file & save
- SSH to all four machines, including the node itself, a couple of times to make sure you are not prompted for anything. The first time you may need to type "yes" to accept the host key (e.g. ssh edge, ssh dn1, ssh dn2)
- Data node 1
- cat ~/.ssh/id_rsa.pub
- Highlight and copy the contents from the cat command above
- Go to each of the other three nodes, vi ~/.ssh/authorized_keys
- Paste the copied contents at the end of the above file & save
- SSH to all four machines, including the node itself, a couple of times to make sure you are not prompted for anything. The first time you may need to type "yes" to accept the host key (e.g. ssh edge, ssh nn1, ssh dn2)
- Data node 2
- cat ~/.ssh/id_rsa.pub
- Highlight and copy the contents from the cat command above
- Go to each of the other three nodes, vi ~/.ssh/authorized_keys
- Paste the copied contents at the end of the above file & save
- SSH to all four machines, including the node itself, a couple of times to make sure you are not prompted for anything. The first time you may need to type "yes" to accept the host key (e.g. ssh edge, ssh nn1, ssh dn1)
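The copy-and-paste of public keys repeated for each node above could also be scripted. The sketch below is one possible alternative, not the method used in this post: it swaps the manual vi edits for ssh-copy-id and then checks the logins with ssh in batch mode. The hostnames follow the examples above (edge, nn1, dn1, dn2); the `DRY_RUN` switch and the function names are made up for the illustration.

```shell
#!/usr/bin/env bash
# Sketch only: hostnames follow the examples in the post.
NODES="edge nn1 dn1 dn2"

# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Append this machine's public key to authorized_keys on every node.
distribute_key() {
  for node in $NODES; do
    run ssh-copy-id -i "$HOME/.ssh/id_rsa.pub" "$node"
  done
}

# BatchMode=yes makes ssh fail instead of prompting for a password,
# so any node that would still prompt shows up as an error.
verify_logins() {
  for node in $NODES; do
    run ssh -o BatchMode=yes "$node" hostname
  done
}

distribute_key
verify_logins
```

Run it once on each of the four machines. As written it only prints the commands it would run; set `DRY_RUN=0` to execute them. ssh-copy-id will still ask for the password once per node, which is expected, since passwordless login only works after the key has been copied.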
Thursday, December 19, 2013
Model thinking
Course by Scott E. Page. Models help solve problems by turning abstract or complex things into something concrete that you can tweak and play around with. With models you can introduce and analyze, one by one, the various parameters that influence the result. Models help explain why things happened one way (why the rich get richer), or help in coming up with equations / rules so that you can easily predict what's going to happen next when parameters change.
Productivity is highest during wars. Why? People are focussed and produce more. A manager must introduce problems periodically so that the team will produce more and be more resilient (e.g. controlled forest fires).
Some problems become very easy to solve once you change the way you represent them (e.g. Cartesian vs. polar coordinates, or the puzzle of picking three numbers that sum to 15).
Many models are better than one, because with a single model you can get stuck.
Do higher-level work and delegate lower-level work to machines. That's the technology advantage that sustains the growth rate. New innovation (new skills) is the way to keep increasing your salary (output). Typically people see a rapid rise in salary in the early years and then it flattens out (like countries whose high growth peters out after a while unless they innovate).
Data points are given. Now find some insight from them. Build models. Predict what's going to happen next.
Wednesday, December 18, 2013
Monday, December 9, 2013
Friday, December 6, 2013
Saturday, November 30, 2013
Friday, November 29, 2013
Saturday, March 9, 2013
Startup = Growth
Startup = Growth:
Are you growing consistently every week? If not, you are not a startup.
'via Blog this'
How to Get Startup Ideas
How to Get Startup Ideas: "When you have an idea for a startup, ask yourself: who wants this right now? Who wants this so much that they'll use it even when it's a crappy version one made by a two-person startup they've never heard of? If you can't answer that, the idea is probably bad."
'via Blog this'
Monday, January 21, 2013
Web 2.0 - Wikipedia, the free encyclopedia
Web 3.0 - Wikipedia, the free encyclopedia:
Semantic Web and personalization. First-generation Metaverse (convergence of the virtual and physical world), a web development layer that includes TV-quality open video, 3D simulations, augmented reality, human-constructed semantic standards, and pervasive broadband, wireless, and sensors. Web 3.0's early geosocial (Foursquare, etc.) and augmented reality (Layar, etc.) webs are an extension of Web 2.0's participatory technologies and social networks (Facebook, etc.) into 3D space.
'via Blog this'
Sunday, January 20, 2013
Saturday, January 19, 2013
Thursday, January 17, 2013
Big Data Means Big IT Job Opportunities -- for the Right People - CIO.com
Big Data Means Big IT Job Opportunities -- for the Right People - CIO.com:
data scientist, data architect, data visualizer and data change agent.
math, statistics, data analysis, business analytics and even natural language processing.
require knowledge of programming and the ability to develop applications, as well as an understanding of how to meet business needs.
"They have to take ideas from one field and apply them to another field, and they have to be comfortable with ambiguity."
In short, big data folks seem to be jacks of all trades and masters of none, and their greatest skill may be the ability to serve as the "glue" in an organization. "You can take someone who maybe is not the world's greatest software engineer [nor] the world's greatest statistician, but they have the communications skills to talk to people on both sides" as well as to the marketing team and C-level executives.
'via Blog this'
The 3 Puzzle Pieces That Shape Your Career Path | LinkedIn
The 3 Puzzle Pieces That Shape Your Career Path | LinkedIn:
Career: Your assets (skills, network, cash balance), Aspirations and Values (thinking big), Market realities (what the market wants) -- Reid Hoffman
'via Blog this'
Tuesday, January 15, 2013
Amazon.com: The Start-up of You: Adapt to the Future, Invest in Yourself, and Transform Your Career (9780307888907): Reid Hoffman, Ben Casnocha: Books
Amazon.com: The Start-up of You: Adapt to the Future, Invest in Yourself, and Transform Your Career (9780307888907): Reid Hoffman, Ben Casnocha: Books: "Being an entrepreneur isn’t really about starting a business. It’s a way of looking at the world: seeing opportunity where others see obstacles, taking risks when others take refuge."
'via Blog this'
Friday, January 11, 2013