Gentoo Linux Box for Vagrant Parallels Provider

As I explained in an earlier post, Vagrant now supports Parallels as a provider. Since I wanted to test how well they work together, I created a standard 64-bit Gentoo Linux box that you can download and use. In addition to a standard Gentoo install, the box also comes with Puppet installed, so you can do some actual work on it.

Assuming you already have the Parallels provider set up, this is how you can download and use the box:

vagrant init orpiske/gentoo-linux-64 && vagrant up

After the box is downloaded from the cloud, you can use Vagrant as usual (e.g. vagrant ssh).

Parallels 10 and Ubuntu 14.04 LTS

If you are having problems running Ubuntu 14.04 after upgrading to Parallels Desktop 10, follow this and then this. It should also resolve any problems during Cinnamon startup.

Vagrant and Parallels Desktop

Maybe this is not news anymore, but Vagrant now supports Parallels. It seems to work with Parallels Desktop 8 and above, but I wasn’t able to run it with version 9 on OS X Yosemite. Upgrading to Parallels Desktop 10 seems to have fixed the issue, and it worked like a charm. One additional problem is the shortage of images in the Vagrant Cloud. Although I believe this will be fixed as the community grows and shares more templates on the cloud, it may be a nuisance to some users.

Fun with Grok and Logstash regexes

I have been using Logstash extensively lately. Along with ElasticSearch, it’s a great tool to centralize logs and simplify access to them. The only difficulty I had was supporting multiline log messages, such as Java stacktraces. I found some good examples online, but none seemed to work the way I wanted. In some cases, my messages were also tagged as _grokparsefailure, which indicates that grok could not match the message against the pattern. I ended up with a regex that is not so different after all, but which matches exactly the way we log messages with log4j:

(^.+Exception.+)|(^\s+at .+)|(^\s+\.\.\. \d+ more)|(^\s*Caused by:.+)
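
To see which lines the expression picks up, here is a minimal Python sketch. It is not the Logstash configuration from this setup (that is not shown here); it only assumes that matching lines are appended to the previous event, similar to what a multiline setting of what => "previous" does. The three dots are escaped so they only match a literal "...", and the sample log lines and the fold_stacktraces helper are made up for illustration.

```python
import re

# Pattern from the post; the three dots are escaped so they match a literal "...".
STACKTRACE_LINE = re.compile(
    r"(^.+Exception.+)|(^\s+at .+)|(^\s+\.\.\. \d+ more)|(^\s*Caused by:.+)"
)


def fold_stacktraces(lines):
    """Append every line matching the pattern to the preceding event."""
    events = []
    for line in lines:
        if STACKTRACE_LINE.match(line) and events:
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events


# Hypothetical log4j-style output, invented for this example.
sample = [
    "2014-10-01 12:00:00 INFO  Main - Server started",
    "2014-10-01 12:00:01 ERROR Main - Request failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Main.handle(Main.java:42)",
    "Caused by: java.io.IOException: disk full",
    "    at com.example.Disk.write(Disk.java:7)",
    "    ... 12 more",
    "2014-10-01 12:00:02 INFO  Main - Still running",
]

events = fold_stacktraces(sample)
for event in events:
    print(event)
    print("--")
```

With this sample, the stacktrace lines end up attached to the ERROR line, while the two INFO lines stay as separate events.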

It’s also worth mentioning that the Grok Debugger website and an adequate regex tutorial are two priceless resources to have at hand.

Using Apache Cassandra with Apache Hadoop

I am currently working on a data analytics website for my own educational purposes. To fulfil my hacking/learning needs, I decided to use Apache Cassandra as the input/output storage engine for an Apache Hadoop map/reduce job.

The job in question is as simple as it gets: it reads the data from a table stored in a Cassandra database and identifies the most commonly used adjectives for each of the major communication service providers (CSPs) in Brazil. After processing, the results are stored in another table in the same Cassandra database. Basically, it is a fancier version of the famous Hadoop word count example.
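
The counting logic can be sketched in plain Python, without Hadoop or Cassandra. This is only an illustration of the map and reduce steps, not the actual job: the rows, the adjective list, the provider names and the function names below are all invented, since the table layout is not shown in this excerpt.

```python
from collections import Counter, defaultdict

# Hypothetical inputs: the real table layout and word list are not shown here.
ADJECTIVES = {"slow", "expensive", "reliable", "fast"}
ROWS = [
    ("ProviderA", "slow and expensive service"),
    ("ProviderA", "so slow today"),
    ("ProviderB", "fast and reliable"),
]


def map_row(csp, text):
    """Map step: emit ((csp, adjective), 1) for each adjective in the text."""
    for word in text.lower().split():
        if word in ADJECTIVES:
            yield (csp, word), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each (csp, adjective) key."""
    totals = defaultdict(Counter)
    for (csp, adjective), count in pairs:
        totals[csp][adjective] += count
    return totals


pairs = [pair for csp, text in ROWS for pair in map_row(csp, text)]
totals = reduce_counts(pairs)
for csp, counter in sorted(totals.items()):
    print(csp, counter.most_common(2))
```

The only difference from the classic word count is the key: instead of a single word, each count is keyed by the provider and the adjective together.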

Unfortunately, there seems to be a lack of up-to-date documentation about integrating Hadoop and Cassandra. Even the official guide seems to be deficient or outdated on this subject. To add insult to injury, I also wanted to use composite keys, which complicated things further. After reading the examples shipped with the Cassandra source code, I was able to implement a working job.

Despite the lack of documentation and the hacking required to figure out how to make it work, the process is quite simple, and even an inexperienced Cassandra/Hadoop developer such as myself can do it without much trouble. In the paragraphs below you will find additional details about the Hadoop and Cassandra integration and what is required to make it work.

Finally, as usual for my coding examples, the source code is available in my GitHub account under the open source Apache License v2.

Continue reading ‘Using Apache Cassandra with Apache Hadoop’

Development Goodies

These are just some development-related links and articles I have read in the last few weeks which I think are worth mentioning:

Understanding webservices specifications (and more)

We all know that JSON and RESTful web services are the new darlings of the Internet and, to some extent, of backend development these days. Their simplicity compared to other mechanisms is, undoubtedly, a good thing. However, a large amount of backend development still relies (and will continue to rely) on SOAP and other mechanisms to provide services. That’s why it’s so important to understand them. This series of articles from IBM developerWorks can help you understand them:

On the other hand, if you want to understand the RESTful side of the force, you may want to read about Developing RESTful Services using Apache CXF.

Data Structures

Data structures are a recurring topic for any software engineer, be it because they are required for pretty much any interview or because you need to find the most adequate solution to a problem you are working on. Nonetheless, there is a vast number of them and it’s not always easy to remember them all. The list below contains some interesting reference material about them.

  • Data Structure Visualizations: contains an animated walk-through of the most used/known data structures. A must-see if you are having trouble understanding any of them.
  • Algorithms + Data Structures = Programs: a book about fundamental topics in computer programming.
  • Know Thy Complexities: a Big-O cheat sheet at the click of a mouse. Bonus point: it links to the Wikipedia articles about each of the items in the cheat sheet.
  • Core Algorithms Deployed: so you want to know who uses a Radix Tree? Lots and lots of good code showing how they are used in real life.

And to add some art to the science, algorithms dance:

Bubble Sort:

Quick Sort:

 

MacPass: a decent, OS X native, KeePass port

A native OS X port of KeePass is something that I have been wanting for a long time. Amazingly, I found one today while browsing the web. You can download it from here, and look at the source code on the project’s GitHub.

Enterprise Integration with Apache Camel

I’ve just published a mini e-book, in Portuguese, about Enterprise Integration with Apache Camel. If you happen to speak Portuguese, you can download it here.
