Optimizing Labs 2 Palabras Architecture for Effective Data Processing


This lab walks through setting up and running the Palabras word-count architecture: logging in, running the count locally, starting the directory, preparing and running the slaves and the master, and finally running the whole distribution on one machine.

  • Palabras
  • Architecture
  • Data Processing
  • Optimization
  • Efficiency


Presentation Transcript


  1. Labs 2: Palabras

  2. Palabras Architecture
     [Diagram: Master1, Master2, Master3, ..., MasterN; Directory; Jobs1, Jobs2, Jobs3, ..., JobsM; Slave1, Slave2, Slave3, ..., SlaveM]

  3. Step 1: Get Started
     Login: Username: nombre\cc5212, password on the board
     http://aidanhogan.com/teaching/cc5212-1/mdp-lab2.zip
     C:/Program Files (x86)/eclipse/ (in Spanish)
     File > Import >
     http://aidanhogan.com/teaching/cc5212-1/mdp-lab2-data/

  4. Step 2: Run Locally
     Dataset: ~600,000 abstracts, ~52,340,000 non-unique words, ~320 MB uncompressed
     How long will it take? Will it even run?
     Class: org.mdp.cli.RunWordCountLocally
     Right Click > Run As > Run Configurations > Arguments
     Program arguments: -i <path>/abstracts-es.txt.gz -igz -k 500
     VM arguments: -Xmx256M
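
The local run above can be sketched roughly as follows. This is not the code of org.mdp.cli.RunWordCountLocally; the class name LocalWordCountSketch, its method names, and the whitespace tokenisation are assumptions made for illustration. It reads a (possibly gzipped) text file, counts lower-cased tokens in an in-memory map, and returns the k most frequent words. Because every distinct word stays in the map, memory grows with the vocabulary, which is why the slide asks whether the run will fit in a 256 MB heap.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of the lab's code base.
class LocalWordCountSketch {

    /** Counts whitespace-separated tokens (lower-cased) across all lines. */
    static Map<String, Integer> countWords(Stream<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        lines.forEach(line -> {
            for (String token : line.toLowerCase(Locale.ROOT).split("\\s+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        });
        return counts;
    }

    /** Returns the k most frequent words; ties are broken alphabetically. */
    static List<Map.Entry<String, Integer>> topK(Map<String, Integer> counts, int k) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        return new ArrayList<>(entries.subList(0, Math.min(k, entries.size())));
    }

    /** Counts the words of a file, unzipping it on the fly when gzipped is true (as with -igz). */
    static Map<String, Integer> countFile(Path file, boolean gzipped) throws IOException {
        try (InputStream raw = Files.newInputStream(file);
             InputStream in = gzipped ? new GZIPInputStream(raw) : raw;
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return countWords(reader.lines());
        }
    }
}
```

Under these assumptions, `topK(countFile(path, true), 500)` would correspond to the `-i <path> -igz -k 500` arguments on the slide.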

  5. Step 3: Start the Directory
     I start the directory!
     Host: vm116.dcc.uchile.cl (172.17.69.190), port 1985
     Remind me to set the heap space

  6. Step 4: Prepare Slave
     Class: org.mdp.cli.StartWordCountSlave
     1. Implement openDirectoryStub()
     2. Add the slave's name to the directory
     3. Review the other code
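
The names openDirectoryStub() and StartRegistryAndServer suggest that the directory is reached through Java RMI, although the slides do not say so explicitly. Under that assumption, opening the stub amounts to locating the registry at the directory's host and port and looking up a bound name. In the sketch below the remote interface SlaveDirectorySketch, its methods, and the binding name are all hypothetical; the lab's real interface and binding name will differ. The localRoundTrip method only exists to exercise the lookup against a throwaway registry on the local machine.

```java
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical remote interface; the lab's real directory interface may differ. */
interface SlaveDirectorySketch extends Remote {
    void addSlave(String name) throws RemoteException;
    List<String> listSlaves() throws RemoteException;
}

/** Trivial in-memory directory, used only for the local round trip below. */
class InMemoryDirectorySketch implements SlaveDirectorySketch {
    private final List<String> names = new ArrayList<>();
    public synchronized void addSlave(String name) { names.add(name); }
    public synchronized List<String> listSlaves() { return new ArrayList<>(names); }
}

class DirectoryClientSketch {
    /** Assumed binding name; check the lab code for the real one. */
    static final String BINDING = "directory-sketch";

    /** Looks up the directory's remote stub in the RMI registry at host:port. */
    static SlaveDirectorySketch openDirectoryStub(String host, int port)
            throws RemoteException, NotBoundException {
        Registry registry = LocateRegistry.getRegistry(host, port);
        return (SlaveDirectorySketch) registry.lookup(BINDING);
    }

    /** Starts a throwaway local registry, registers one slave name, returns the listing. */
    static List<String> localRoundTrip(int port, String slaveName) throws Exception {
        // Advertise the loopback address so the demo does not depend on hostname resolution.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        InMemoryDirectorySketch directory = new InMemoryDirectorySketch();
        Registry registry = LocateRegistry.createRegistry(port);
        try {
            Remote exported = UnicastRemoteObject.exportObject(directory, 0);
            registry.bind(BINDING, exported);
            SlaveDirectorySketch stub = openDirectoryStub("127.0.0.1", port);
            stub.addSlave(slaveName); // step 2 of the slide: add the slave's name
            return stub.listSlaves();
        } finally {
            // Unexport both objects so the JVM can exit once the demo is done.
            UnicastRemoteObject.unexportObject(directory, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }
}
```

In the lab setting the equivalent call would point at the shared directory, for example `openDirectoryStub("vm116.dcc.uchile.cl", 1985)`, with the real interface and binding name substituted.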

  7. Step 5: Run Slave
     Build the .jar using build.xml (dist)
     Open cmd and go to the directory containing the .jar
     java -jar -Xmx256M mdp-2.jar StartWordCountSlave -dn vm116.dcc.uchile.cl -dp 1985 -sn <username>

  8. Step 6: Prepare Master
     Class: org.mdp.cli.StartWordCountMaster
     1. Connect to the directory
     2. Get the list of slaves from the directory
     3. Clear your words from each slave
     4. Choose a slave for each word
     5. Send the add-words job to each slave
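
The slide does not say how to choose a slave for each word. One common option, shown here only as an assumption, is hash partitioning: sort the slave names so that every run sees the same order, then map each word to the slave at index hash(word) mod number-of-slaves. Every occurrence of a word then lands on the same slave, so each slave can total its own words independently. The class WordPartitionerSketch and its methods are hypothetical names, not part of the lab code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating hash partitioning of words over slaves.
class WordPartitionerSketch {

    /** Picks the slave for a word; sortedSlaves must be in the same order on every call. */
    static String chooseSlave(String word, List<String> sortedSlaves) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(word.hashCode(), sortedSlaves.size());
        return sortedSlaves.get(index);
    }

    /** Groups words into one batch per slave, ready to be sent as add-words jobs. */
    static Map<String, List<String>> batchBySlave(List<String> words, List<String> slaves) {
        List<String> sorted = new ArrayList<>(slaves);
        Collections.sort(sorted); // same order regardless of how the directory listed them
        Map<String, List<String>> batches = new LinkedHashMap<>();
        for (String slave : sorted) {
            batches.put(slave, new ArrayList<>());
        }
        for (String word : words) {
            batches.get(chooseSlave(word, sorted)).add(word);
        }
        return batches;
    }
}
```

Sending one batch per slave, rather than one remote call per word, keeps the number of remote calls small; for the big dataset the batches would presumably need to be flushed periodically to stay within the 256 MB heap.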

  9. Step 7: Run Master
     For the small dataset!
     Class: org.mdp.cli.StartWordCountMaster
     Right Click > Run As > Run Configurations > Arguments
     -i <path>\es-abstracts-10k.txt.gz -igz -dp 1985 -dn vm116.dcc.uchile.cl -mn <username> -k 500

  10. Step 8: Run Big Master
      For the big dataset!
      Class: org.mdp.cli.StartWordCountMaster
      Right Click > Run As > Run Configurations > Arguments
      -i <path>\es-abstracts.txt.gz -igz -dp 1985 -dn vm116.dcc.uchile.cl -mn <username> -k 500

  11. Step 9: Run Distribution Locally
      1. Start a directory server
         Build and use the jar
         java -jar mdp-2.jar StartRegistryAndServer -n localhost -p 1985 -r -s 1 -sp
      2. Start 4 slaves (give them different names) in four different CMD windows
         Use the jar
         java -jar mdp-2.jar StartSlave -dn localhost -dp 1985 -wn <usernameN>
      3. Start a master
         Can use Eclipse or the jar (as preferred)
         Point it to the local directory
         Use the small file (then the large file if successful)
         -Xmx256M

  12. Final Step: Teach Me Spanish
      Ask me words in the top 500!
