Better JavaScript memory management and HTML prefetching tags could make Web sites more responsive

In the fiercely competitive world of Internet services, Google constantly seeks ways to speed up the delivery of content to its hundreds of millions of users.

At the O’Reilly Velocity conference this week in New York, two Google engineers presented some of their favorite tips and research for expediting delivery of Web pages and applications. Such knowledge could be handy for other Web developers looking to make their products more responsive.


Google developer advocate and performance expert Colt McAnlis tackled one of the thorniest problems for mobile Web developers today: JavaScript performance.

Web-based JavaScript applications can suffer from performance issues, especially on mobile clients, because JavaScript engines use GC (garbage collection) to manage memory. “You shouldn’t rely on garbage collectors,” McAnlis told the audience of Web developers.

GC helps programmers by automatically reclaiming the memory a program no longer needs. Managing memory by hand in low-level languages such as C and C++ is a laborious process, and such languages aren’t natively supported by browsers anyway.

The problem with many JavaScript Web applications is that JavaScript engines will launch their garbage collection routines at seemingly random times, which will cause applications to momentarily slow down. The frame rate of a video application, for instance, may decrease. Or the time it takes an application to execute an operation may jump to a noticeable 20 milliseconds, up from a typical 3 to 5 milliseconds.

Overall, for GC to work without being noticed by the user, the system memory must be six times as large as the amount of memory being used, said McAnlis, referring to a well-known study. This can be a demanding requirement given the limited memory of mobile devices and the number of memory-hungry applications they run.

Add to this issue the increasing use of closures, a programmer-friendly technique for widening the availability of locally defined variables. jQuery, for instance, is a widely used JavaScript library that relies on closures and, as a result, creates a lot of spikes in memory allocation.

“Closures scare me,” McAnlis said, referring to how unpredictable they can be in terms of the amount of memory they can consume.

To improve performance, and better manage memory, developers should use an approach similar to the one used by the middleware library Emscripten, which is being used to build high performance HTML5 Web games.

 


Encoding Vs Encryption

Posted: 12/10/2013 in ASP.NET

Encoding and encryption are similar concepts, but they serve different purposes. Encoding transforms data into another format using a publicly available scheme, such as base64. Anyone can later decode the data to convert it back to its original format. The purpose of encoding might be to compress data to save memory, or to make sure data survives transfer over a channel. Encryption also transforms data into another format, but it does so for security: only those who hold the decryption key or password can read the data. Encryption is done using a specific key or password.
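To make the encoding half of that distinction concrete, here is a minimal Python sketch using the standard library's base64 module. The sample string is made up for illustration. The point it tries to show is that the reverse step needs no key at all.

```python
import base64

# Hypothetical sample data, not taken from any real system.
original = "user=alice;role=admin".encode("utf-8")

# Encoding: a public, keyless transformation into a transport-safe alphabet.
encoded = base64.b64encode(original)

# Anyone who sees the encoded bytes can reverse them; no secret is involved.
decoded = base64.b64decode(encoded)

print(encoded.decode("ascii"))
print(decoded == original)
```

Encryption, by contrast, would require a key for the reverse step, which is why base64 on its own should not be treated as protection.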

Online Quiz Coding

Posted: 08/09/2013 in ASP.NET

Introduction

The idea to write an online quiz came up when I was looking for an XML tutorial. I visited W3Schools.com and found not only the tutorial I was looking for but, more interestingly, an online quiz. I took the 20-question quiz once, and hey! I felt great about it. I wonder why so few web sites offer an online quiz like that.

A quiz is a great way to test your knowledge. An online quiz is a great addition to your web site that could keep your visitors glued for a few more minutes.

Download the demo project and try it. It is a 10-question quiz to challenge your knowledge about Australian geography. Don’t worry! All data is kept in a clear, human-readable XML document, so you could easily peek for answers.

The Script Explained

I will not go through a very detailed discussion of the script, but will rather highlight several areas in it. Once you get the whole idea of the script, it is easy to modify or extend it to suit your requirements.

Recursive Script

There is only one aspx script to do various tasks in the online quiz. It is a recursive script: a script that ‘calls’ itself over and over again until a certain condition is reached. Precisely, the script does not call itself, but posts form data to itself. This process is known as post back.

As the script posts back to itself continuously over the duration of the quiz, we could say that it has many states. The first state initializes several essential variables, counts the total number of questions, and records the quiz start time. Then, in the first and each following state, the script displays a multiple-choice question to challenge the user (see the snapshot above). Answering the question triggers the onClick event, which forces a post-back and moves the script to the next state. In the next state, the script runs a subroutine associated with the event to check the answer and display the next multiple-choice question. The flow repeats again and again until the last question is processed, at which point the result is displayed.
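The state-to-state flow described above can be sketched outside ASP.NET. The following Python snippet is only an illustration of the idea, not the article's script: QUESTIONS, first_state and handle_postback are hypothetical names, and a plain dictionary stands in for the state carried between post-backs.

```python
# Illustrative only: the real quiz is an ASP.NET page, not Python.
# QUESTIONS, first_state and handle_postback are hypothetical names.
QUESTIONS = [
    {"text": "Capital of Australia?", "answers": ["Sydney", "Canberra"], "correct": 1},
    {"text": "City with an extensive tram network?", "answers": ["Sydney", "Melbourne"], "correct": 1},
]

def first_state():
    """Initial state: set up the counters before the first question is shown."""
    return {"question_no": 0, "score": 0, "total": len(QUESTIONS), "finished": False}

def handle_postback(state, selected_index):
    """Check the posted answer, then advance to the next question or to the result."""
    question = QUESTIONS[state["question_no"]]
    new_state = dict(state)
    if selected_index == question["correct"]:
        new_state["score"] += 1
    new_state["question_no"] += 1
    new_state["finished"] = new_state["question_no"] >= new_state["total"]
    return new_state

state = first_state()
for answer in (1, 0):  # simulated user clicks: one right, one wrong
    state = handle_postback(state, answer)
print(state)
```

Each call receives the previous state and returns the next one, which mirrors how the page hands its variables to itself on every post-back.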

The following activity diagram represents the recursive flow of the online quiz script.

Maintaining State

The online quiz script needs to maintain state of its variables. There are a bunch of alternatives to do so. The most advanced way is to use the session object, and the conventional way is to use hidden inputs or a QueryString. ASP.NET introduces another alternative called ‘state bag’. Unlike the session object, state bag is not persisted over the whole user session, but is brought forward from one page to another page just like hidden inputs and querystring. However, it is superior to hidden inputs and QueryStrings, since it can accept more data types and the content has been encoded into a single string and therefore is not easy to tamper with.

Storing value into state bag:

ViewState("TotalQuestion") = intTotalQuestion

Getting value from state bag:

intTotalQuestion = ViewState("TotalQuestion")

The following is a list of variables to be kept in the state bag:

Variable Name | State Bag Name | Data Type | Comments
intTotalQuestion | TotalQuestion | int | Keeps the total number of questions in the quiz. The value is populated in the first state of the quiz and remains constant for the duration of the quiz.
intScore | Score | int | Keeps the number of correct answers.
intQuestionNo | QuestionNo | int | Holds the number of the last question the user attempted.
arrAnswerHistory | AnswerHistory | arraylist of int | Records the answers given in the quiz. It records 0 (zero) if the answer is correct; otherwise it records the SelectedIndex of the radio buttons.
(none) | CorrectAnswer | int | Holds the correct answer of the previous question. It is made available in the next state, when the answer is checked for correctness.
(none) | StartTime | date | Holds the start time of the quiz. It is used to calculate the time spent on the quiz.

 

XML Data

Data for the online quiz is kept in an XML document named quiz.xml, which is validated using an XML schema named quiz.xsd. A valid XML document consists of a root element called quiz, which has at least one element called mchoice (short for multiple-choice). Each mchoice element has one question child element, and two or more answer child elements. The answer element may have the correct attribute with possible value of either yes or no. In fact, you should supply the correct attribute with a value of yes to one of the answers in the same mchoice, otherwise there will be no correct answer for the question.

quiz.xml:

<?xml version="1.0" encoding="UTF-8"?>
<quiz xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="quiz.xsd">
  <mchoice>
    <question>What is the capital city of Australia ?</question>
    <answer>Sydney</answer>
    <answer correct="yes">Canberra</answer>
    <answer>Melbourne</answer>
    <answer>Gold Coast</answer>
  </mchoice>
  <mchoice>
    <question>Which city has an extensive tram network?</question>
    <answer>Sydney</answer>
    <answer correct="yes">Melbourne</answer>
    <answer>Adelaide</answer>
    <answer>Ballarat</answer>
  </mchoice>
</quiz>
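Because the correct attribute is optional, nothing in the document structure itself guarantees that each question has exactly one answer marked yes. As a rough sketch (in Python rather than ASP.NET, and with a hypothetical function name), a small check like the following could be run after editing the file. The inlined XML is a shortened copy of quiz.xml in which the second question deliberately has no correct answer.

```python
import xml.etree.ElementTree as ET

# A shortened copy of quiz.xml, inlined so the sketch is self-contained.
# The second mchoice deliberately lacks a correct="yes" answer.
QUIZ_XML = """<quiz>
  <mchoice>
    <question>What is the capital city of Australia?</question>
    <answer>Sydney</answer>
    <answer correct="yes">Canberra</answer>
  </mchoice>
  <mchoice>
    <question>Which city has an extensive tram network?</question>
    <answer>Sydney</answer>
    <answer>Melbourne</answer>
  </mchoice>
</quiz>"""

def questions_without_single_correct_answer(xml_text):
    """Return the question text of every mchoice that lacks exactly one correct="yes" answer."""
    root = ET.fromstring(xml_text)
    problems = []
    for mchoice in root.findall("mchoice"):
        correct = [a for a in mchoice.findall("answer") if a.get("correct") == "yes"]
        if len(correct) != 1:
            problems.append(mchoice.findtext("question"))
    return problems

problems = questions_without_single_correct_answer(QUIZ_XML)
print(problems)
```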

It is possible to insert HTML tags within the XML data, therefore the quiz may contain decorated texts, images, links, etc. instead of plain text. Just make sure to enclose the HTML tags with CDATA block, so the XML document is still valid. Look at the following example:

  <mchoice>
    <question><![CDATA[<span>Which of the following is <u>NOT</u> 
an Australian native animal?</span>]]></question>
    <answer>Kangaroo</answer>
    <answer correct="yes">Penguin</answer>
    <answer>Koala</answer>
    <answer>Wombat</answer>
  </mchoice>

quiz.xsd:

 

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:element name="quiz">
    <xs:complexType>
      <xs:choice>
        <xs:element name="mchoice" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="question" type="xs:string"/>
              <xs:element name="answer" minOccurs="2" maxOccurs="unbounded">
                <xs:complexType>
                  <xs:simpleContent>
                    <xs:extension base="xs:string">
                      <xs:attribute name="correct" use="optional">
                        <xs:simpleType>
                          <xs:restriction base="xs:string">
                            <xs:enumeration value="yes"/>
                            <xs:enumeration value="no"/>
                          </xs:restriction>
                        </xs:simpleType>
                      </xs:attribute>
                    </xs:extension>
                  </xs:simpleContent>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:choice>
    </xs:complexType>
  </xs:element>
</xs:schema>

The online quiz script does not validate the XML document against the XML Schema for several reasons. First, it is a resource-intensive process, forcing XMLTextReader to go through each element and attribute in the XML document. Second, we need to validate the XML document just once after it has been updated, instead of every time we load the file. To validate the XML document, you could write a separate aspx script or use a third-party tool and run it manually every time you finish updating the XML document.

XML Document Object Model

The XML Document Object Model (DOM) is a tree-like structure that represents every node of an XML document based on its hierarchical relationship with its parent nodes and child nodes. The DOM allows us to navigate and manipulate an XML document in a more logical way.

To build an XML DOM, we use the XMLDocument class. The XMLDocument class itself extends the XMLNode class, therefore many of its properties and methods are inherited from XMLNode. While XMLNode’s methods and properties apply to a specific node in an XML document, methods and properties of XMLDocument apply to the whole XML document.

The following code creates an instance of XMLDocument and builds the XML DOM from quiz.xml:

Dim xDoc as XMLDocument = New XMLDocument()
xDoc.Load(Server.MapPath("quiz.xml"))

Addressing a specific node in the XML DOM is a bit tricky. We should navigate our ‘pointer’ through the DOM starting from its root node. The following code demonstrates how to address the first question of the first multiple choice:

Dim xNode as XMLNode

'Go to the first node in the file, which is the <?xml ?> declaration
xNode = xDoc.FirstChild

'Go to the next sibling of the current node, which is <quiz>
xNode = xNode.NextSibling

'Go to the first child of the current node, which is <mchoice>
xNode = xNode.FirstChild

'Go to the first child of the current node, which is <question>
xNode = xNode.FirstChild

'Print the content of the current node
Response.Write("The question is: " & xNode.InnerXml)

It is definitely a tedious task, particularly if you want to address nodes located at a very low level in the hierarchy. Luckily, we can utilize the XPath language to more directly address a specific node or a group of nodes. If you are unfamiliar with XPath, it will be briefly explained in the next section.

The SelectNodes method of the XMLNode and XMLDocument classes accepts an XPath string and returns a collection of XMLNode objects, called an XMLNodeList. Another method, SelectSingleNode, does the same thing but returns only a single XMLNode object.

'Using SelectNodes to address all answers of the first multiple choice
Dim xNodeList As XMLNodeList
Dim i as Integer
xNodeList = xDoc.SelectNodes("/quiz/mchoice[1]/answer")
For i = 0 to xNodeList.Count-1
  Response.Write("<p>" & xNodeList.Item(i).InnerXml & "</p>")
Next

'Using SelectSingleNode to select the first question of the first multiple choice
Dim xNode as XMLNode
xNode = xDoc.SelectSingleNode("/quiz/mchoice[1]/question")
Response.Write("The question is: " & xNode.InnerXml)

XPath

XPath is a language to address specific nodes in an XML document. It could address a single node or a group of nodes by describing its hierarchy relationship in a string, therefore it is often called an XPath string. If you are familiar with file path or URL, then the concept of XPath is nothing new.

Read the XPath string from left to right and you will be able to figure out the node it is addressing. Except for several conditions and functions, XPath is actually easy to use. The following table demonstrates some usage of XPath against quiz.xml:

XPath String | Result
/quiz | Select the root node of the XML, including all elements it contains.
/quiz/mchoice | Select all mchoice child elements of the quiz.
/quiz/mchoice[1] | Select the first mchoice (multiple choice) child element of the quiz.
/quiz/mchoice[1]/question | Select all questions of the first multiple choice of the quiz.
/quiz/mchoice[1]/answer[4] | Select the fourth answer of the first multiple choice of the quiz.
/quiz/mchoice[1]/answer[4]/@correct | Select the ‘correct’ attribute of the fourth answer of the first multiple choice of the quiz.
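For readers who want to try these expressions without an ASP.NET setup, here is a small sketch in Python. It relies on the standard library's xml.etree.ElementTree, which supports only a subset of XPath: paths are written relative to the root element, and there is no attribute axis, so attributes are read with get(). The inlined XML is the first question from quiz.xml.

```python
import xml.etree.ElementTree as ET

QUIZ_XML = """<quiz>
  <mchoice>
    <question>What is the capital city of Australia?</question>
    <answer>Sydney</answer>
    <answer correct="yes">Canberra</answer>
    <answer>Melbourne</answer>
    <answer>Gold Coast</answer>
  </mchoice>
</quiz>"""

root = ET.fromstring(QUIZ_XML)  # root is the <quiz> element

# Counterpart of /quiz/mchoice[1]/question (relative to root here).
question = root.find("./mchoice[1]/question").text

# Counterpart of /quiz/mchoice[1]/answer
answers = [a.text for a in root.findall("./mchoice[1]/answer")]

# Counterpart of /quiz/mchoice[1]/answer[4]/@correct; the attribute is
# read with get(), which returns None when the attribute is absent.
fourth_correct = root.find("./mchoice[1]/answer[4]").get("correct")

# A predicate on an attribute: the answer marked correct="yes".
correct_answer = root.find("./mchoice[1]/answer[@correct='yes']").text

print(question, answers, fourth_correct, correct_answer)
```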

XPath contains a lot more surprises. If you are interested enough to explore more, check out the XPath Tutorial at W3Schools.

Conclusions

This article presented an online quiz as a tool to add interactivity to your web site. It also explored several topics like recursive scripts, navigating and manipulating XML documents using the XML DOM, a glimpse of the XPath language, and a brief discussion about state maintenance.

 

Author:
Enrico Elizar Samuel

Web Developer
Singapore Singapore

If you’re responsible for monitoring Twitter for conversations about your brand, you’re faced with a challenge: You need to know what people are saying about your brand at all times AND you don’t want to live your entire life in front of Twitter Search.

Over the years, a number of social media applications have been released specifically for brand managers and social media teams, but most of those applications (especially the free/inexpensive ones) differentiate themselves only by the quality of their analytics and how real-time their data is reported. If that’s what you need, you have plenty of fantastic options. Those differentiators don’t really help you if you want to take a more passive role in monitoring Twitter search … You still have to log into the application to see your fancy dashboards with all of the information. Why can’t the data come to you?

About three weeks ago, Hazzy stopped by my desk and asked if I’d help build a tool that uses the Twitter Search API to collect brand keywords mentions and send an email alert with those mentions in digest form every 30 minutes. The social media team had been using Twilert for these types of alerts since February 2012, but over the last few months, messages have been delayed due to issues connecting to Twitter search … It seems that the service is so popular that it hits Twitter’s limits on API calls. An email digest scheduled to be sent every thirty minutes ends up going out ten hours late, and ten hours is an eternity in social media time. We needed something a little more timely and reliable, so I got to work on a simple “Twitter Monitor” script to find all mentions of our keyword(s) on Twitter, email those results in a simple digest format, and repeat the process every 30 minutes when new mentions are found.

With Bear’s Python-Twitter library on GitHub, connecting to the Twitter API is a breeze. Why did we use Bear’s library in particular? Just look at his profile picture. Yeah … ’nuff said. So with that Python wrapper to the Twitter API in place, I just had to figure out how to use the tools Twitter provided to get the job done. For the most part, the process was very clear, and Twitter actually made querying the search service much easier than we expected. The Search API finds all mentions of whatever string of characters you designate, so instead of creating an elaborate Boolean search for “SoftLayer OR #SoftLayer OR @SoftLayer …” or any number of combinations of arbitrary strings, we could simply search for “SoftLayer” and have all of those results included. If you want to see only @ replies or hashtags, you can limit your search to those alone, but because “SoftLayer” isn’t a word that gets thrown around much without referencing us, we wanted to see every instance. This is the code we ended up working with for the search functionality:

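def status_by_search(search):
    # Fetch recent matches, then keep only Tweets newer than the last one reported.
    statuses = api.GetSearch(term=search)
    last_seen = get_log_value()
    results = [status for status in statuses if status.id > last_seen]
    if not results:
        # Nothing new: stop here so that no empty digest is sent.
        exit()

    returns = [format_status(result) for result in results]
    new_tweets(results)
    return returns, len(returns)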

If you walk through the script, you’ll notice that we want to return only unseen Tweets to our email recipients. Shortly after we got the Twitter Monitor up and running, we noticed how easy it would be to get spammed with the same messages every time the script ran, so we had to filter our results accordingly. Twitter’s API allows you to request Tweets with a Tweet ID greater than one that you specify. However, when I tried designating that “oldest” Tweet ID, we had mixed results … Whether due to my ignorance or a fault in the implementation, we were getting fewer results than we should. Tweet IDs are unique and numerically sequential, so they can be relied upon as much as datetime (and are far easier to boot), so I decided to use the highest Tweet ID from each batch of processed messages to filter the next set of results. The script stores that Tweet ID and uses a little bit of logic to determine which Tweets are newer than the last Tweet reported.

def new_tweets(results):
    # Advance the stored ID only when this batch contains a newer Tweet.
    newest_id = max(result.id for result in results)
    if get_log_value() < newest_id:
        set_log_value(newest_id)
        return True
    return False

def get_log_value():
    with open('tweet.id', 'r') as f:
        return int(f.read())

def set_log_value(messageId):
    with open('tweet.id', 'w+') as f:
        f.write(str(messageId))
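The "highest Tweet ID seen so far" idea can be tried in isolation, without the Twitter API. The sketch below is a simplified, hypothetical variant of the helpers above: read_last_id and keep_unseen are illustrative names, plain integers stand in for Tweets, and a missing state file is treated as "nothing seen yet", which is an assumption the original functions do not make.

```python
import os
import tempfile

def read_last_id(path):
    """Return the stored Tweet ID, or 0 if nothing has been stored yet."""
    try:
        with open(path) as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0

def keep_unseen(tweet_ids, path):
    """Return the IDs newer than the stored one and advance the stored value."""
    last_seen = read_last_id(path)
    unseen = [tweet_id for tweet_id in tweet_ids if tweet_id > last_seen]
    if unseen:
        with open(path, "w") as f:
            f.write(str(max(unseen)))
    return unseen

# Simulate two runs of the script against a throwaway state file.
state_file = os.path.join(tempfile.mkdtemp(), "tweet.id")
first_run = keep_unseen([101, 102, 103], state_file)
second_run = keep_unseen([102, 103, 104], state_file)
print(first_run, second_run)
```

The second run reports only the ID that was not part of the first batch, which is the behavior that keeps repeated mentions out of the digest.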

Once we culled out our new Tweets, we needed our script to email those results to our social media team. Luckily, we didn’t have to reinvent the wheel here, and we added a few lines that enabled us to send an HTML-formatted email over any SMTP server. One of the downsides of the script is that login credentials for your SMTP server are stored in plaintext, so if you can come up with another alternative that adds a layer of security to those credentials (or lets you send with different kinds of credentials) we’d love for you to share it.
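One common way to keep credentials out of the script itself is to read them from environment variables. The sketch below is not the SoftLayer script's code: build_digest, send_digest and the SMTP_USER and SMTP_PASSWORD variable names are assumptions made for illustration, and the addresses are placeholders. It uses only the Python standard library.

```python
import html
import os
import smtplib
from email.message import EmailMessage

def build_digest(sender, recipient, tweets):
    """Build an HTML email with a plain-text fallback listing the tweets."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Twitter Monitor: %d new mention(s)" % len(tweets)
    items = "".join("<li>%s</li>" % html.escape(tweet) for tweet in tweets)
    message.set_content("\n".join(tweets))  # plain-text fallback
    message.add_alternative("<ul>%s</ul>" % items, subtype="html")
    return message

def send_digest(message, host, port):
    """Send the message; credentials come from the environment, not the source file."""
    user = os.environ["SMTP_USER"]          # hypothetical variable name
    password = os.environ["SMTP_PASSWORD"]  # hypothetical variable name
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(user, password)
        server.send_message(message)

# Building the message needs no network access; send_digest is not called here.
digest = build_digest("monitor@example.com", "social@example.com",
                      ["First mention", "Second <b>mention</b>"])
print(digest["Subject"])
```

Environment variables still have to be protected on the server, so this only moves the secret out of the source file rather than solving the problem completely.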

From that point, we could run the script manually from the server (or a laptop for that matter), and an email digest would be sent with new Tweets. Because we wanted to automate that process, I added a cron job that would run the script at the desired interval. As a bonus, if the script doesn’t find any new Tweets since the last time it was run, it doesn’t send an email, so you won’t get spammed by “0 Results” messages overnight.

The script has been in action for a couple of weeks now, and it has gotten our social media team’s seal of approval. We’ve added a few features here and there (like adding the number of Tweets in an email to the email’s subject line), and I’ve enlisted the help of Kevin Landreth to clean up the code a little. Now, we’re ready to share the SoftLayer Twitter Monitor script with the world via GitHub!

SoftLayer Twitter Monitor on GitHub

The script should work well right out of the box in any Python environment with the required libraries after a few simple configuration changes:

  • Get your Twitter Consumer Secret, Access Token and Access Secret from https://dev.twitter.com/
  • Copy/paste that information where noted in the script.
  • Update your search term(s).
  • Enter your mailserver address and port.
  • Enter your email account credentials if you aren’t working with an open relay.
  • Set the self.from_ and self.to values to your preference.
  • Ensure all of the Python requirements are met.
  • Configure a cron job to run the script at your desired interval. For example, if you want to send emails every 10 minutes: */10 * * * * <path to python> <path to script> > /dev/null 2>&1

As soon as you add your information, you should be in business. You’ll have an in-house Twitter Monitor that delivers a simple email digest of your new Twitter mentions at whatever interval you specify!

Like any good open source project, we want the community’s feedback on how it can be improved or other features we could incorporate. This script uses the Search API, but we’re also starting to play around with the Stream API and SoftLayer Message Queue to make some even cooler tools to automate brand monitoring on Twitter.

Breaking Down ‘Big Data’

Posted: 01/08/2013 in Web Tech

Forrester defines big data as “techniques and technologies that make capturing value from data at an extreme scale economical.” Gartner says, “Big data is the term adopted by the market to describe extreme information management and processing issues which exceed the capability of traditional information technology along one or multiple dimensions to support the use of the information assets.” Big data demands extreme horizontal scale that traditional IT management can’t handle, and it’s not a challenge exclusive to the Facebooks, Twitters and Tumblrs of the world … Just look at the Google search volume for “big data” over the past eight years:

Big Data Search Interest

Developers are collectively facing information overload. As storage has become more and more affordable, it’s easier to justify collecting and saving more data. Users are more comfortable with creating and sharing content, and we’re able to track, log and index metrics and activity that previously would have been deleted in consideration of space restraints or cost. As the information age progresses, we are collecting more and more data at an ever-accelerating pace, and we’re sharing that data at an incredible rate.

To understand the different facets of this increased usage and demand, Gartner came up with the three V’s of big data that vary significantly from traditional data requirements: Volume, Velocity and Variety. Larger, more abundant pieces of data (“Volume”) are coming at a much faster speed (“Velocity”) in formats like media and walls of text that don’t easily fit into a column-and-row database structure (“Variety”). Given those equally important factors, many of the biggest players in the IT world have been hard at work to create solutions that provide the scale and speed developers need when they build social, analytics, gaming, financial or medical apps with large data sets.

When we talk about scaling databases here, we’re talking about scaling horizontally across multiple servers rather than scaling vertically by upgrading a single server (adding more RAM, increasing HDD capacity, etc.). It’s important to make that distinction because it leads to a unique challenge shared by all distributed computer systems: the CAP theorem. According to the CAP theorem, a distributed storage system must choose to sacrifice either consistency (everyone sees the same data) or availability (you can always read and write) while retaining partition tolerance (the system continues to operate despite arbitrary message loss or the failure of part of the system).

Let’s take a look at a few of the most common database models, what their strengths are, and how they handle the CAP theorem compromise of consistency v. availability:

Relational Databases

What They Do: Stores data in rows/columns. Parent-child records can be joined remotely on the server. Provides speed over scale. Some capacity for vertical scaling, poor capacity for horizontal scaling. This type of database is where most people start.
Horizontal Scaling: In a relational database system, horizontal scaling is possible via replication (sharing data between redundant nodes to ensure consistency), and some people have success with sharding (horizontal partitioning of data), but those techniques add a lot of complexity.
CAP Balance: Prefer consistency over availability.
When to Use: When you have highly structured data, and you know what you’ll be storing. Great when production queries will be predictable.
Example Products: Oracle, SQLite, PostgreSQL, MySQL

Document-Oriented Databases

What They Do: Stores data in documents. Parent-child records can be stored in the same document and returned in a single fetch operation with no join. The server is aware of the fields stored within a document, can query on them, and return their properties selectively.
Horizontal Scaling: Horizontal scaling is provided via replication, or replication + sharding. Document-oriented databases also usually support relatively low-performance MapReduce for ad-hoc querying.
CAP Balance: Generally prefer consistency over availability.
When to Use: When your concept of a “record” has relatively bounded growth, and can store all of its related properties in a single doc.
Example Products: MongoDB, CouchDB, BigCouch, Cloudant

Key-Value Stores

What They Do: Stores an arbitrary value at a key. Most can perform simple operations on a single value. Typically, each property of a record must be fetched in multiple trips, with Redis being an exception. Very simple, and very fast.
Horizontal Scaling: Horizontal scale is provided via sharding.
CAP Balance: Generally prefer consistency over availability.
When to Use: Very simple schemas, caching of upstream query results, or extreme speed scenarios (like real-time counters).
Example Products: Couchbase, Redis, PostgreSQL HStore, LevelDB
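As a rough illustration of how sharding spreads keys across servers, the following Python sketch routes each key to one of three in-memory dictionaries by hashing the key. It is a toy under stated assumptions: shard_for, put and get are hypothetical names, the dictionaries stand in for servers, and no particular product is claimed to work this way.

```python
import hashlib

def shard_for(key, shard_count):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count

# Three in-memory dicts stand in for three hypothetical servers.
shards = [{} for _ in range(3)]

def put(key, value):
    """Store the value on the shard that owns the key."""
    shards[shard_for(key, len(shards))][key] = value

def get(key):
    """Look the key up on the shard that owns it."""
    return shards[shard_for(key, len(shards))].get(key)

put("counter:home", 42)
put("session:abc", "alice")
print(get("counter:home"), get("session:abc"))
```

Simple modulo routing like this moves most keys to a different shard whenever the shard count changes, which is one reason real systems often prefer consistent hashing.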

BigTable-Inspired Databases

What They Do: Store data in column-oriented stores inspired by Google’s BigTable paper. Cassandra has tunable CAP parameters and can be adjusted to prefer either consistency or availability. Both example products are fairly operationally intensive.
Horizontal Scaling: Good speed and very wide horizontal scale capabilities.
CAP Balance: Prefer consistency over availability.
When to Use: When you need consistency and write performance that scales past the capabilities of a single machine. HBase in particular has been used with around 1,000 nodes in production.
Example Products: HBase, Cassandra (inspired by both BigTable and Dynamo)

Dynamo-Inspired Databases

What They Do: Distributed key/value stores inspired by Amazon’s Dynamo paper. A key written to a dynamo ring is persisted in several nodes at once before a successful write is reported. Riak also provides a native MapReduce implementation.
Horizontal Scaling: Dynamo-inspired databases usually provide for the best scale and extremely strong data durability.
CAP Balance: Prefer availability over consistency.
When to Use: When the system must always be available for writes and effectively cannot lose data.
Example Products: Cassandra, Riak, BigCouch
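The idea that a write is reported as successful only after several nodes have stored it can be sketched in a few lines of Python. This is a deliberately simplified toy: replicated_write is a hypothetical function, the nodes are plain dictionaries, and ring placement, retries and repair of failed replicas are all left out.

```python
def replicated_write(nodes, key, value, required_acks):
    """Write to every reachable node; report success only after enough acknowledgements."""
    acks = 0
    for node in nodes:
        if node["up"]:
            node["data"][key] = value
            acks += 1
    return acks >= required_acks

# Three hypothetical replicas, one of which is down.
nodes = [
    {"up": True, "data": {}},
    {"up": True, "data": {}},
    {"up": False, "data": {}},
]

ok_with_two = replicated_write(nodes, "cart:42", ["book"], required_acks=2)
ok_with_three = replicated_write(nodes, "cart:42", ["book", "pen"], required_acks=3)
print(ok_with_two, ok_with_three)
```

Lowering the number of required acknowledgements keeps the system writable when nodes fail, at the cost of replicas that may briefly disagree, which is the availability-over-consistency trade described above.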

Each of the database models has strengths and weaknesses, and there are huge communities that support each of the open source examples I gave in each model. If your database is a bottleneck or you’re not getting the flexibility and scalability you need to handle your application’s volume, velocity and variety of data, start looking at some of these “big data” solutions.

@NK Aravind


The internet around the globe has been slowed down in what security experts are describing as the biggest cyber-attack of its kind in history.

“Based on the reported scale of the attack, which was evaluated at 300 Gigabits per second, we can confirm that this is one of the largest DDoS operations to date,” online security firm Kaspersky Lab said in a statement. “There may be further disruptions on a larger scale as the attack escalates.”

It is having an impact on popular services like Netflix – and experts worry it could escalate to affect banking and email systems.

Spamhaus, a group based in both London and Geneva, is a non-profit organisation that aims to help email providers filter out spam and other unwanted content.

To do this, the group maintains a number of blocklists: databases of servers known to be used for malicious purposes.

Recently, Spamhaus blocked servers maintained by Cyberbunker, a Dutch web host that states it will host anything with the exception of child pornography or terrorism-related material.

Spamhaus said it was able to cope as it has highly distributed infrastructure and technology in a number of countries.


Big Data is likely to have a major impact on our world, amounting to a significant technological shift over the next decade. Don’t believe us? Watch this video to see some real-world examples of how Big Data is already making an impact.
Thanks to IBM. Great success!

Video  —  Posted: 21/07/2013 in Web Tech