Monday, 23 May 2022

How to do Web Scraping with Ruby by Processing CSV

Tags: devops, rails, ruby

This article shows how to do simple web scraping in Ruby by processing a CSV file.

We will create a Ruby on Rails application that scrapes the links uploaded in a CSV file and finds the occurrences of each link on a particular page.

In the application, the user uploads a CSV file and provides a list of email addresses to which the processed CSV will be sent.

The CSV file has two columns:

  1. referal_link
  2. home_link
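As a minimal sketch of what reading such a file could look like, the snippet below parses a small in-memory CSV with Ruby's standard `csv` library. The sample URLs and the `parse_links` helper are assumptions made for illustration; they are not taken from the article's application.

```ruby
require 'csv'

# Hypothetical sample input; the URLs are placeholders.
SAMPLE_DATA = <<~DATA
  referal_link,home_link
  https://example.com/page,https://example.org
  https://example.com/other,https://example.net
DATA

# Parse CSV text into an array of hashes keyed by column name.
# `parse_links` is an illustrative helper, not part of the article's code.
def parse_links(csv_text)
  CSV.parse(csv_text, headers: true).map do |row|
    { referal_link: row['referal_link'], home_link: row['home_link'] }
  end
end

rows = parse_links(SAMPLE_DATA)
rows.each { |row| puts "#{row[:referal_link]} -> #{row[:home_link]}" }
```

With an uploaded file, `CSV.foreach(path, headers: true)` would iterate over the rows in the same way without loading the whole file into a string first.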

Let’s start creating a Rails Application

  • Run the commands below to create a new Rails application.

$ rails new scrape_csv_data
$ cd scrape_csv_data

  • Next, generate an UploadCsv scaffold with the command below.

$ rails g scaffold UploadCsv generated_csv:string csv_file:string

This creates the model, controller, views, and migration for UploadCsv. Run the migration with the command below.

$ rails db:migrate

  • Add the CarrierWave gem to the Gemfile to handle file uploads.

gem 'carrierwave', '~> 2.0'

  • Then install it.

$ bundle install

  • Then create the uploader with CarrierWave using the command below.

$ rails generate uploader Avatar

  • We will attach the uploader in the model app/models/upload_csv.rb. 

class UploadCsv < ApplicationRecord
  mount_uploader :csv_file, AvatarUploader
end

  • Update the routes.

Rails.application.routes.draw do
  resources :upload_csvs
  root 'upload_csvs#index'
end

  • Then start the server and check that the application works.

$ rails s

  • Next, create a job that reads the CSV file and scrapes the links in it. The job stores the generated file in the generated_csv column of the record that triggered it. Run the command below.

$ rails generate job generate_csv

  • Add the gems below to the Gemfile and run bundle install.

gem 'httparty'
gem 'nokogiri'

  • Then replace the generated contents of the GenerateCsv job with the scraping code.
  • Run the job from an after_create callback on upload_csvs, and add a presence validation for csv_file.
  • Update app/models/upload_csv.rb accordingly.
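The job and model listings are not reproduced here, so the following is only a sketch of the core logic such a job might contain, under stated assumptions: the page at referal_link is fetched and the links pointing to home_link are counted. To stay self-contained it takes the page HTML through a stubbed `fetch_page` callable and uses plain string matching; in the application the HTML would more likely come from `HTTParty.get(url).body` and be queried with Nokogiri. The helper names and the `occurrences` column are hypothetical.

```ruby
require 'csv'

# Count href attributes in the HTML that point at the given link,
# with or without a trailing slash. Plain string matching stands in
# for a Nokogiri query in this sketch.
def count_link_occurrences(html, link)
  html.scan(/href=["']#{Regexp.escape(link)}\/?["']/i).length
end

# Build the result CSV text. `fetch_page` is a hypothetical callable
# that returns the HTML for a URL.
def build_result_csv(rows, fetch_page)
  CSV.generate do |csv|
    csv << %w[referal_link home_link occurrences]
    rows.each do |row|
      html = fetch_page.call(row[:referal_link])
      count = count_link_occurrences(html, row[:home_link])
      csv << [row[:referal_link], row[:home_link], count]
    end
  end
end

# Stubbed fetcher so the sketch runs without network access.
fake_pages = {
  'https://example.com/page' =>
    '<a href="https://example.org">Home</a> <a href="https://example.org/">Again</a>'
}
fetch_page = ->(url) { fake_pages.fetch(url, '') }

rows = [{ referal_link: 'https://example.com/page', home_link: 'https://example.org' }]
result_csv = build_result_csv(rows, fetch_page)
puts result_csv
```

In a Rails app this logic would sit in the job's `perform` method, and the model could enqueue it from an `after_create` callback, for example with `GenerateCsvJob.perform_later(id)` (the class name assumes the default generator naming).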

After uploading a file, check that the scraped result file has been generated. You can find it at scrape_csv_data/public/result_data.csv

  • Now we will send the generated file by email, following the steps below.

First, generate the mailer with the command below.

$ rails generate mailer NotificationMailer

Add a send_csv action to the notification mailer to send the generated file.
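The mailer listing is not reproduced here. One piece any such mailer needs is a clean recipient list, so here is a minimal sketch of that step, assuming the user's addresses arrive as a single comma-separated string (an assumption, since the input format is not specified). `parse_recipients` is a hypothetical helper.

```ruby
require 'uri'

# Split a comma-separated string of addresses and keep only the
# well-formed, unique ones. The comma-separated input format is an
# assumption for this sketch.
def parse_recipients(raw)
  candidates = raw.to_s.split(',').map(&:strip).reject(&:empty?).uniq
  candidates.select { |email| email.match?(URI::MailTo::EMAIL_REGEXP) }
end

recipients = parse_recipients('alice@example.com, bob@example.com, not-an-email, alice@example.com')
puts recipients.inspect
```

Inside the mailer action, Action Mailer's `attachments['result_data.csv'] = File.read(path)` followed by `mail(to: recipients, subject: '...')` would then attach the file and address the message; the subject text and `path` are left to the application.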

We also need to add the mail configuration to config/environments/development.rb or config/environments/production.rb.
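The configuration itself is not shown, so the following is a sketch of one way to assemble SMTP settings from environment variables, which keeps credentials out of the repository. The variable names, the default host, and the `smtp_settings_from_env` helper are placeholders chosen for illustration.

```ruby
# Build an SMTP settings hash from environment variables.
# Variable names and defaults are placeholders for this sketch.
def smtp_settings_from_env(env = ENV)
  {
    address: env.fetch('SMTP_ADDRESS', 'smtp.example.com'),
    port: Integer(env.fetch('SMTP_PORT', '587')),
    user_name: env['SMTP_USERNAME'],
    password: env['SMTP_PASSWORD'],
    authentication: :plain,
    enable_starttls_auto: true
  }
end

# A plain hash stands in for ENV so the example runs anywhere.
settings = smtp_settings_from_env({ 'SMTP_ADDRESS' => 'smtp.example.com', 'SMTP_PORT' => '2525' })
puts settings.inspect
```

In the environment file, the hash would be assigned with `config.action_mailer.delivery_method = :smtp` and `config.action_mailer.smtp_settings = smtp_settings_from_env`.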

Finally, update the mailer view:

app/views/notification_mailer/send_csv.html.erb
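The view body is not reproduced here. As a rough sketch of what such a template might contain, the snippet below renders a small ERB template with Ruby's standard `erb` library. The wording of the message and the `file_name` variable are assumptions; in Rails the template would live in the view file and read instance variables set by the mailer action.

```ruby
require 'erb'

# Hypothetical view body; the article's own template is not shown.
VIEW_TEMPLATE = <<~HTML
  <p>Hello,</p>
  <p>The scraped results are attached as <%= file_name %>.</p>
HTML

# Render the template with a local variable standing in for mailer data.
html = ERB.new(VIEW_TEMPLATE).result_with_hash(file_name: 'result_data.csv')
puts html
```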

Thank you!
