Microsoft Excel – Clean function – calling functions from VBS – Accessing files from VBS on Mac

Today I have been working on importing a text file into Excel and using VBS to do some processing and transformation of it.  In the process I found 3 things worth sharing/commenting on:

1) You may be familiar with the “trim()” function in Excel and in VBS.  It removes extra white spaces in a string.  But did you know that there is a “clean()” function in Excel that will remove “non-printable” characters.  There is a good write up of it at www.globaliconnect.com/excel/index.php?option=com_content&view=article&id=118:excel-text-and-string-functions-trim-a-clean&catid=78&Itemid=474

 

2) There is no “clean()” function in VBS.  But you can call all of the Excel functions by using “Application.WorksheetFunction” object.  Therefore you can call the “clean()” function from VBS by using “Application.WorksheetFunction.clean()”

 

3) File paths from VBS in Mac are not what you expect.  You can not just put in the /path/to/the/file.txt and expect for it to work.  Still a little more work to figure this one out.

 

Wikipedia Paragraphs to Google Sheet Columns

I am in the process of evaluating some Robotic Simulation software.  The first part of the process is to create a list of software to evaluate.  Wikipedia offers a good starting point with a list of package names, urls, and descriptions.  But what I want to do is to put this in a Google Sheet and add additional columns to it.

WikipediaListToGoogleSheet

 

Method 1: Copy & Paste

The first method that you might consider is copying and pasting the various components into a Google sheet.  There are 34 items in the list.  Each item contains 3 parts: package name, URL, and description.  Assuming you could copy and paste each component in less than 5 seconds, that would mean about 15 seconds per item.  all told 15 * 34 = 610 seconds.  This means about 10 minutes of work.

 

Method 2: Regular Expressions

Regular expressions allow for you to match pattern of strings and then to manipulate those patterns.  This allows for you to take a task above that would take 10 minutes of steady, solid, error prone work.  But there is a method that could take that same work and turn it into less than a minute of effort.  After which you will feel smart and could use the other 9 minutes to write a blog post like this.

 

Step 1:  Click on the Edit tab in Wikipedia:  That will let you see the Wiki Markup that looks like this:

* [http://www.parallemic.org/RoKiSim.html RoKiSim]: A free educational software for the 3D simulation of six-axis PUMA-type serial robots.
* [http://www.RoboAnalyzer.com RoboAnalyzer]: A free educational software to teach Robotics Concepts and Virtual Simulation (Joint Jogging and Cartesian Motion) using CAD models of Industrial Robots.

 

Step 2:  Copy and paste the wiki markup that represents the list of items and put it in NotePad++.

 

Step 3:  Identify what you are looking for and how you want to transform it.

You can see that each list item starts with a *.  The link and package name are in “[ ]”  And there is a semicolon that you want to get rid of.

 

 

Find this:

 * [http://www.parallemic.org/RoKiSim.html RoKiSim]: A free educational software for the 3D simulation of six-axis PUMA-type serial robots.

And replace it with (the extra spaces represent tabs… Tabs allow for us to put the information into different columns in Google Sheets):

 http://www.parallemic.org/RoKiSim.html    RoKiSim    A free educational software for the 3D simulation of six-axis PUMA-type serial robots.

 

Step 4: Determine the Regular Expression that you will need to search for and the replace

 

Find: \*\[(.*) (.*)\]

Replace: \1\t\2\t

 

Here is what your NotePadd++ replace dialog box should look like.  Click “Replace All”.

Screenshot 2014-11-24 10.31.30

 

Step 5:  Copy and Paste the results into Google Sheets and you will have columns containing the URL, Name, and description.  You can then move around the columns into the order you want.

 

Figuring out the regular expression is going to be the most difficult part of this process.  It is a skill that is not widely mastered, and looks mysterious.  But it would be better to spending more time to master this skill and eventually be able to quickly pull this together quickly, than to waste that time doing copy and paste.