Monday, March 29, 2010

Semantic web - Web 3.0

The semantic web is a vision of information that computers (machines) can understand, so that they can perform more of the tedious work involved in finding, combining, and acting upon information on the web.

At present, only humans are capable of using the Web to carry out these tasks. A computer cannot accomplish them without human direction because web pages are designed to be read by humans, not machines.

Tim Berners-Lee considers the semantic web to be the next web, Web 3.0.

The W3C and many others are rushing to come out with design principles and standards.

Some of these include Resource Description Framework (RDF), a variety of data interchange formats (e.g. RDF/XML, N3, Turtle, N-Triples), and notations such as RDF Schema (RDFS) and the Web Ontology Language (OWL), all of which are intended to provide a formal description of concepts, terms, and relationships within a given knowledge domain.
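
To make the idea of machine-readable statements more concrete, here is a minimal sketch of RDF's core model, the subject-predicate-object triple, serialized as N-Triples lines. The Triple class and to_ntriples function are hypothetical helpers written for this illustration, and the example.org URIs are made-up sample data; only the FOAF vocabulary terms are real.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement (hypothetical helper, not a library type)."""
    subject: str                  # URI of the thing being described
    predicate: str                # URI of the property
    obj: str                      # URI, or a plain literal value
    obj_is_literal: bool = False  # True when obj is text rather than a URI


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if t.obj_is_literal:
        # Escape backslashes and double quotes inside the literal.
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


# Sample data: the example.org URIs are assumptions for illustration only.
triples = [
    Triple("http://example.org/people/alice",
           "http://xmlns.com/foaf/0.1/name", "Alice", True),
    Triple("http://example.org/people/alice",
           "http://xmlns.com/foaf/0.1/knows",
           "http://example.org/people/bob"),
]

for t in triples:
    print(to_ntriples(t))
```

Each printed line is one statement that a program can parse without guessing at the page layout, which is the point of the formats listed above.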

Tools used in the making of Facebook

Facebook is now the #1 social-networking site in the world. Have you wondered which tools made Facebook so successful? Besides people and data, the key ingredients that make Facebook so yummy, there are the utensils that make it happen. Today I shall focus on some of the tools used for Facebook. Most of the descriptions here are extracted from the Facebook developer site or developer wiki sites.


Scribe - An open source server for aggregating log data streamed in real time from a large number of servers. It is designed to be scalable, extensible without client-side modification, and robust to failure of the network or any specific machine.

A Scribe server running on every node in the system aggregates messages and sends them to a central Scribe server. If the central Scribe server isn't available, the local Scribe server writes the messages to a file on local disk and sends them when the central server recovers. The central server can write the messages to the files that are their final destination, typically on an NFS filer or a distributed filesystem, or it can send the messages to another layer of Scribe servers.

A client log entry consists of two strings: a category and a message.
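
The store-and-forward behaviour described above can be sketched in a few lines. This is not Scribe's API: LocalForwarder and FakeCentral are hypothetical names, and the buffer lives in memory here, whereas Scribe writes to a file on local disk. The sketch only shows the idea of holding (category, message) entries until the central server is reachable again.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

LogEntry = Tuple[str, str]  # (category, message), as in a Scribe log entry


class FakeCentral:
    """Stand-in for the central server (hypothetical, for the demo only)."""

    def __init__(self) -> None:
        self.up = True
        self.received: List[LogEntry] = []

    def receive(self, category: str, message: str) -> None:
        if not self.up:
            raise ConnectionError("central server unavailable")
        self.received.append((category, message))


class LocalForwarder:
    """Buffers entries locally and forwards them in order when possible."""

    def __init__(self, send_to_central: Callable[[str, str], None]) -> None:
        self._send = send_to_central
        self._buffer: Deque[LogEntry] = deque()

    def log(self, category: str, message: str) -> None:
        self._buffer.append((category, message))
        self.flush()

    def flush(self) -> None:
        # Send the oldest entry first; stop at the first failure so that
        # nothing is lost and ordering is preserved.
        while self._buffer:
            category, message = self._buffer[0]
            try:
                self._send(category, message)
            except ConnectionError:
                return
            self._buffer.popleft()

    @property
    def pending(self) -> int:
        return len(self._buffer)


central = FakeCentral()
forwarder = LocalForwarder(central.receive)

forwarder.log("web", "GET /home")     # delivered immediately
central.up = False
forwarder.log("web", "GET /profile")  # central is down: kept locally
pending_while_down = forwarder.pending
central.up = True
forwarder.flush()                     # central recovered: backlog is sent
```

After the final flush the central server holds both entries in their original order and the local buffer is empty.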

Facebook and Twitter are currently using Scribe.

Source code at http://github.com/facebook/scribe



Thrift is a software framework (a set of libraries) for scalable cross-language services development. It combines a powerful software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, and Ruby. Thrift allows you to define data types and service interfaces in a simple definition file. Taking that file as input, the compiler generates code that can be used to build RPC clients and servers that communicate seamlessly across programming languages.

http://wiki.apache.org/thrift/FrontPage


OpenLink Data Spaces (ODS) is a Distributed Collaborative Web Application Platform, Social Network, and Content Management System for creating presence in the semantic web via Data Spaces derived from Weblogs, Wikis, Feed Aggregators, Photo Galleries, Shared Bookmarks, Discussion Forums and more.


Data Spaces are a new database-management technology frontier that deals with the virtualization of heterogeneous data and data sources via a plethora of data-access protocols.

As Unified Data Stores, Data Spaces also provide solid foundation for the creation, processing and dissemination of knowledge, making them a natural foundation platform for the emerging Data-Web (Semantic Web, Layer 1).
 
Data Spaces also provide a cost-effective route for generating Semantic Web Presence from Web 2.0 and traditional Web data-sources, by delivering an atomic data container for RDF Instance Data derived from data hosted in Blogs, Wikis, Shared Bookmark Services, Discussion Forums, Web File Servers, Photo Galleries, etc.
For more information, please refer to:

http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/Ods


Facebook Markup Language (FBML) is a markup language used to customise the "look and feel" of the applications that developers create.

It is a variant of HTML with some elements removed. It lets Facebook application developers customise the "look and feel" of their applications and encode content so that Facebook's servers can read and publish it.

FBML set by any application is cached by Facebook until a subsequent API call replaces it. Facebook also offers a specialised Facebook JavaScript (FBJS) library.

FBML Tags
  • Social data tags
  • Sanitization tags
  • Design tags
  • Component tags
  • Control tags


Facebook Platform wiki site:
http://en.wikipedia.org/wiki/Facebook_Platform


FB Development Tools
Facebook Application Programming Interface (API)
Facebook Query Language (FQL)
Facebook Markup Language (FBML)
Facebook JavaScript (FBJS)
Facebook Connect

Friday, March 26, 2010

Local Cloud, virtualization and more.

Even though cloud computing has been around for quite a while, most organisations are unlikely to jump straight to the cloud. Most will approach it first with a hybrid method, moving the least sensitive and lower-risk applications to the cloud while the majority of applications are retained in the traditional computing system. Adoption will then evolve in phases, and some organisations will start to offer computing resources to users as a utility-based product. Let's take a look at virtualization, as it is the main driver of cloud computing.

Virtualization within the organisation can be segmented into three types:
  • Server Virtualization
    • Using VMware, Hyper-V, Zoning
  • Application Virtualization
    • Software technology that separates the application from the underlying OS
  • Presentation Virtualization
    • Example: "Thin Client"
Types of server virtualization
  • Microsoft Hyper-V, based on Windows Server 2008
  • VMware, based on Linux & Windows servers
    • VMware ESX Server 3i, 3.5
    • VirtualCenter 2.5
Types of thin client
  • Sun's Sun Ray (Sun Virtual Desktop Infrastructure (VDI))
  • HP VDI & 6720t Mobile Thin Client (including Neoware)
  • Dell (e.g. OptiPlex & Flexible Computing Solutions)

Why virtualise?
  • Cost
  • Reduces administration
  • Fast Deployment
  • Reduced Infrastructure Cost 

Not suited for cloud :
  • Security sensitive data & app
  • Graphic Intensive app
  • Business Intelligence and data warehousing

Best Practices 
  • Find the right vendor
  • Read the fine print
  • Measure the performance
  • Spread service across more than one reliable cloud vendor
  • Data should be easy for its owner to view and port
  • Data security and data flow
  • Phase by phase approach
#################################################
Based on Gartner's prediction on Cloud Computing Market

Phase I : 2007 -2011 (Pioneer & Trailblazer)
  • Cloud Market development phase, and SEAP(service enabled application platform)

Phase II : 2010 - 2013 (Cloud market consolidation)
  • By 2012, the SEAP market will be overcrowded with a broad range of cloud solutions

Phase III : 2012 - 2015 & beyond (mainstream critical mass & commoditization)
  • Cloud market matures, with a small number of dominant cloud players providing de facto standards.

References:
Cloud Computing: A Practical Approach, by Anthony T. Velte, Toby J. Velte & Robert Elsenpeter (McGraw-Hill)

Standard used in cloud computing

Communication
  • HTTP - http1.1
  • XMPP (Extensible Messaging and Presence Protocol), also known as "Jabber"
Security
  • SSL
  • OpenID
  • PCI DSS (Payment Card Industry Data Security Standard)
Client
  • HTML
  • Dynamic HTML (DHTML)
  • Document Object Model (DOM)
  • Javascript
Infrastructure
  •  Virtualisation
    • Open Hypervisor Standards
    • Community Source (e.g in VMware ESX server)
    • Open Virtualization Format (OVF)
Web Services
  • REST(Representational state transfer)
  • SOAP (Simple Object Access Protocol)
  • JSON(Javascript Object Notation)
  • XML

Thursday, March 25, 2010

Idea of Web 2.0

Idea:
  • The Web as a platform ( Web is the interconnection of applications and devices)
  • Collective Intelligence is harnessed
  • Valuable Data commodity
  • Software that cross boundaries
  • It gets better the more users use it
  • Continuous growth and improvement
  • Social rich-user experience
 Offshoots from web 2.0
    • Enterprise 2.0
    • Government 2.0
    • Law 2.0
    • Library 2.0
    • Media 2.0
    • Advertising 2.0
    • Music 2.0
    • Identity 2.0
    • Democracy 2.0

Sunday, March 21, 2010

Cloud Computing and the major players

Google
  • Google App Engine - empowers developers to build web apps on cloud infrastructure
  • Google Web Toolkit - a PaaS, tools based on Java with AJAX support for development & debugging
Amazon
  • EC2 - Elastic Compute Cloud is a web service that offers elastic compute capacity in the cloud.
  • SimpleDB
  • S3 - Simple Storage Service is Amazon's cloud storage solution
  • CloudFront - web service for content delivery in the cloud.
  • SQS - Amazon's Simple Queue Service is a cloud messaging system
  • EBS - Elastic Block Store is Amazon's persistent storage

Microsoft
  • Azure Services Platform (cloud services platform hosted by Microsoft)
  • Exchange Online
  • SharePoint services
  • Microsoft Dynamics CRM
Salesforce.com
  • Force.com - on-demand PaaS
  • Visualforce - tool for designing business apps
  • AppExchange - directory of third-party apps for Salesforce.com
  • Salesforce.com CRM

EMC
  • VMware
  • Symmetrix V-Max system - management system to manage virtual datacenters
  • Storage leader

NetApp
  • Cloud Storage & data management solutions
  • Unified storage architecture (Joint with Cisco)
Cisco
  • Unified Computing System
IBM
  • IBM Global Business Services
  • X-Force
SAP-IBM
  • Partnership with IBM on project RESERVOIR(REsource and SERvices Virtualisation withOut barrIeRs)

Yahoo
  • Yahoo! Research & CRL(Computational research laboratories)
HP-Intel-Yahoo!
  • creating a global multidatacenter, opensource test bed for cloud computing research.
  • partnership with Singapore's IDA, University of Illinois & Karlsruhe Institute of Technology (Germany)

Friday, March 19, 2010

FB Architecture - Brief

Facebook is one of the most used social networking sites, if not the most used. Today I will try to summarize the Facebook (FB) architecture as I understood it from a talk given by Aditya (Dir. Engineering, Facebook) in Dec 08. (Note: this is my interpretation of the architecture from the talk.)



Basically FB uses a modified version of the LAMP framework; LAMP stands for Linux, Apache, MySQL and PHP/Perl. The wise people at FB have modified and enhanced most of the LAMP components, especially MySQL & PHP, while adding further services to create FB. Web 2.0 tools like Scribe, Thrift & ODS were employed in developing FB. A giant in-memory hash table, Memcache, also helps to make FB fast and responsive. Services such as newsfeed, Adserver, Search, network selector, CSS Parser, ShareScrapper, mobile and blogfeed make up the complete FB architecture.

A notable mention in the FB design is 'memcache'. Why memcache? It is fast, it alleviates the load on the databases, it caches serialized PHP data structures, and it supports multi-get to retrieve several keys at once. FB has modified memcache to run over UDP, reducing the overhead of TCP's persistent connections.
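
The way a cache takes load off the database can be illustrated with a small cache-aside sketch. It does not use the real memcache client: CachedStore and its multi_get method are hypothetical names, and plain dictionaries stand in for both memcache and the database. The db_reads counter simply makes the saved database lookups visible.

```python
from typing import Dict, Iterable


class CachedStore:
    """Cache-aside lookup: try the cache first, fall back to the database."""

    def __init__(self, database: Dict[str, str]) -> None:
        self.database = database          # stand-in for the real database
        self.cache: Dict[str, str] = {}   # stand-in for memcache
        self.db_reads = 0                 # number of database lookups made

    def multi_get(self, keys: Iterable[str]) -> Dict[str, str]:
        """Fetch several keys in one call, filling the cache on misses."""
        keys = list(keys)
        found = {k: self.cache[k] for k in keys if k in self.cache}
        for key in keys:
            if key in found:
                continue
            self.db_reads += 1            # cache miss: go to the database
            value = self.database.get(key)
            if value is not None:
                found[key] = self.cache[key] = value
        return found


store = CachedStore({"user:1": "Alice", "user:2": "Bob"})

first = store.multi_get(["user:1", "user:2"])   # both keys miss the cache
reads_after_first = store.db_reads

second = store.multi_get(["user:1", "user:2"])  # both keys hit the cache
reads_after_second = store.db_reads
```

The second call returns the same data without touching the database, which is the load reduction the talk credited to memcache.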

One key note is that data in Facebook can be slightly delayed and need not be strictly real-time, although FB tries to be consistent. Data is also laid out by recency for optimization. Data in the form of key-value pairs is distributed evenly across multiple DB instances behind load balancing (LB). This allows FB to scale fast without requiring replication. Queries in the FB application are usually simple; joins are not allowed and are not needed.
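
One common way to spread key-value pairs evenly over several database instances is to hash the key and take the result modulo the number of instances. The sketch below assumes that approach; the talk did not say which scheme FB uses, and ShardedStore and shard_for are hypothetical names, with dictionaries standing in for the DB instances.

```python
import hashlib
from typing import Dict, List, Optional


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Key-value store split across several in-memory 'DB instances'."""

    def __init__(self, num_shards: int) -> None:
        self.shards: List[Dict[str, str]] = [{} for _ in range(num_shards)]

    def put(self, key: str, value: str) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> Optional[str]:
        # A lookup touches exactly one shard, so no join across
        # instances is ever needed for a simple key query.
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(num_shards=4)
for i in range(1000):
    store.put(f"user:{i}", f"profile-{i}")

shard_sizes = [len(shard) for shard in store.shards]
print(shard_sizes)  # sizes should be roughly equal
```

Because each key lives on exactly one instance, simple queries stay simple. The trade-off of plain modulo hashing is that changing the number of shards moves most keys, which is why schemes such as consistent hashing are often used instead.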

One thing to always remember is that FB is not just a social site; it also allows others to contribute and grow its applications. Developers can write FB apps and use FBML extensions to interface with FB and publish their programs easily. Of course, an FB application has to be hosted somewhere other than the FB site. This flexibility allows fast adoption and wide use of FB.

Tuesday, March 16, 2010

How Oracle starts its database

Let's get into what happens behind the scenes when we start an Oracle database, assuming Oracle is already installed on the system and a database has been created.

Prerequisites
  • ORACLE_HOME is defined (normally the Oracle DB install path)
  • ORACLE_SID is known (the DB instance that we want to connect to)

Process
  1. Set up the Oracle environment for the user (Oracle DBA user, ORACLE_HOME & ORACLE_SID).
  2. Using the "oracle" DBA account, issue >> startup [db]
  3. Oracle looks up the spfile [$ORACLE_HOME/dbs/spfile$ORACLE_SID.ora] <-- binary file
  4. With the spfile, Oracle creates the SGA (System Global Area) in memory.
  5. All the Oracle background processes are also started for the instance (DBWR, LGWR, SMON, PMON, etc.).
  6. The SGA is assigned space for
    • Fixed Buffer
    • Variable Buffer
    • DB Buffer
    • Log Buffer
  7. Next, Oracle looks for the control files named in the spfile. Control files are crucial as they allow Oracle to
    • locate the database files: datafiles, redo log files and tempfiles
    • check global DB file consistency
    • check whether any rollback is needed
    • At this point the datafiles are known.
  8. The database can then be mounted.
  9. Next, the database is opened.
  10. If connections come from other systems or client users, we need to start the listener (LSNR) >> lsnrctl start
Basically these are the normal steps an Oracle database goes through during startup.
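
The steps above pass through the stages Oracle calls NOMOUNT, MOUNT and OPEN. The following is a simplified model of that progression, not Oracle code: the startup function and its three boolean flags are hypothetical, and real Oracle would attempt recovery rather than just stopping when the files are inconsistent.

```python
from enum import Enum


class Stage(Enum):
    SHUTDOWN = "shutdown"  # nothing running
    NOMOUNT = "nomount"    # spfile read, SGA allocated, processes started
    MOUNT = "mount"        # control files read, file locations known
    OPEN = "open"          # datafiles and redo logs opened for users


def startup(spfile_found: bool,
            control_files_found: bool,
            datafiles_consistent: bool) -> Stage:
    """Return the furthest stage reached (simplified, hypothetical model)."""
    if not spfile_found:
        # Step 3 fails: without parameters the instance cannot start.
        return Stage.SHUTDOWN
    # Steps 4 to 6: SGA allocated and background processes started.
    if not control_files_found:
        # Step 7 fails: the instance is up but cannot mount the database.
        return Stage.NOMOUNT
    # Step 8: control files read, so the database is mounted.
    if not datafiles_consistent:
        # Step 9 is blocked until the files are made consistent.
        return Stage.MOUNT
    return Stage.OPEN


for flags in [(True, True, True), (True, False, True), (False, True, True)]:
    print(flags, "->", startup(*flags).value)
```

Thinking of startup as a ladder of stages also explains why a DBA can stop at an intermediate stage, for example to work on the control files before the database is opened.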

Monday, March 15, 2010

Oracle Database Physical & Logical storage architecture

Oracle's database storage architecture is interesting and well designed. Separating the physical and logical components is really helpful. For example, DBAs can maintain the physical structures while the logical structures remain in use.

Clearly isolating these two tiers helps both DBAs & developers. Developers do not need to remember which data files hold their data. Administrators can grow or shrink tablespaces as and when needed by adding or removing data files.

PHYSICAL

Datafiles

  • Take note: a datafile can be assigned to only one tablespace, but a tablespace can have many datafiles

  • Whenever a user requests data, Oracle retrieves it from these datafiles with the help of the instance's processes and the SGA (System Global Area). Manipulated data is written back to these datafiles, so that the changed data is available to all users.

Besides the physical data files, there are also other physical files such as the redo log files and control files.

Redo Log files

  • The redo log files store the change entries generated by DML.
  • They are used during database recovery.
  • Often redo log files are also archived and copied offline for recovery purposes (archived redo logs).

Control Files

  • Control files are essential for starting the DB instance: they are read when the database is mounted. (The initialisation parameters live in the parameter file, init.ora or the spfile, which names the control files.)
  • They record information about the physical structure of the database, such as datafile sizes and locations and redo log file locations.


LOGICAL

Tablespace
  • A tablespace is a logical structure which holds the other logical structures of the database.
  • It is further broken into segments; a segment is broken into extents, and an extent into data blocks.
  • Tablespaces have two main types: Data & Index TS and Undo TS.
Segments
  • A table in the database is stored in a Data Segment.
  • An index in the database is stored in an Index Segment.
  • Temporary Segment
  • Rollback Segment

Extents
  • A segment is further broken into extents. 
  • An extent consists of one or more data blocks.
  • When a database object grows, a new extent is allocated.
  • An extent cannot be named.

Data Block
  • Smallest unit of storage in the database. 
  • Every data block in a tablespace has the same fixed size, a specific number of bytes.
  • A data block consists of one or more operating system blocks.
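
The logical hierarchy (tablespace, segment, extent, data block) can be modelled with a few small classes to show how the sizes add up. This is an illustrative sketch only: the class names are hypothetical, and the 8192-byte block size and 8-block extents are assumed values chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Extent:
    blocks: int  # number of data blocks in this extent


@dataclass
class Segment:
    name: str
    extents: List[Extent] = field(default_factory=list)

    def allocate_extent(self, blocks: int) -> None:
        """Grow the segment by one extent, as when a table is enlarged."""
        self.extents.append(Extent(blocks))

    def total_blocks(self) -> int:
        return sum(extent.blocks for extent in self.extents)


@dataclass
class Tablespace:
    name: str
    block_size: int  # bytes per data block, the same for every block
    segments: List[Segment] = field(default_factory=list)

    def size_bytes(self) -> int:
        blocks = sum(segment.total_blocks() for segment in self.segments)
        return blocks * self.block_size


# Assumed example values: 8 KB blocks, extents of 8 blocks each.
users = Tablespace(name="USERS", block_size=8192)
orders = Segment(name="ORDERS")
orders.allocate_extent(blocks=8)   # initial extent
orders.allocate_extent(blocks=8)   # table grew, second extent allocated
users.segments.append(orders)

print(users.size_bytes())  # 2 extents * 8 blocks * 8192 bytes = 131072
```

The arithmetic mirrors the list above: a segment grows an extent at a time, and every extent is a whole number of equally sized data blocks.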

Thursday, March 11, 2010

Architecture concept for web 2.0

Architecture makes the design and building of IT more consistent and manageable. I have recently studied some architectural patterns, and it is perhaps a good idea to record this web 2.0 pattern here for future reference.

Basically there are 5 main component groups for web 2.0.

Design &
Development  -->  Client/App Tier
     |                  |
     |            Connectivity Tier
     |                  |
     |            Service Tier
     |                  |
     [ Resource Tier            ]


Client/App Tier
This tier allows users to interface with the services. It is where users interact with web 2.0.
Components such as VMs, portals, media renditions and security controls may be present in this tier.

Connectivity Tier
This tier supports the standard protocols that handle all connectivity in the architecture. Most often the technology is defined by community standards bodies like W3C, OASIS, etc. (e.g. XML/HTTP)

Service Tier
This is where resources are packaged and grouped as services. It is also where business rules (BR) and workflow are added. (e.g. PHP, ASP, Rails, SOAP)

Resource Tier
This tier holds the repositories of data and the supporting processing needed to create Rich Internet Applications. (ERP, CRM, DB, MQ, LDAP, legacy systems)

Design & Development
This tier supports the development of the other 4 architectural components by providing standardization and control, for example through an Integrated Development Environment (IDE).

Tuesday, March 9, 2010

Web 2.0 quotes

" Web 1.0 was about connecting computers and making technology more efficient for computers. Web 2.0 is about connecting people and making technology efficient for people."
Dan Zambonini

" The Internet is a platform spanning all devices " 
Tim O'Reilly, 2005

"Trust I seek and I find in you
Every day for us, something new
Open mind for a different view
and nothing else matters."
"Nothing Else Matters" by Metallica

"If you build it.. they will come"
Field of Dreams

Institution takes on Cloud.

I had the opportunity to catch a short seminar by Tan Chee Chiang, Assoc. Director, High Performance Computing, Computer Centre, NUS. Briefly, here is what I summarised from the seminar:

From an institution's point of view, cloud computing:
  • Allows for high-density servers requiring special power and cooling
  • Handles short-term peaks at term end and during project peak periods
  • Gives the agility to stretch like a rubber band in providing services
Challenges:

  • Stable connectivity between consumer and the cloud service provider.
  • Data security in the cloud.

Sunday, March 7, 2010

Some common Web 2.0 Patterns

The following are some common architecture patterns.
  • Service Oriented Architecture(SOA)
  • Participation Collaborations
  • Software as a Service(SaaS)
  • AJAX( or Asynchronous component update)
  • Rich Internet Application(RIA)
  • Synchronized web
  • Mashup
  • Folksonomy
  • Tag Gardening
  • Semantic Web Grounding
  • XML(Structured Information)
  • Persistent Rights Management

Saturday, March 6, 2010

Why Web 2.0 is what it is..

" Web 2.0 is the network as a platform, spanning all connected devices; Web 2.0 applications are those that make the most of the intrinsic advantage of that platform:-
  • delivering software as a continually updated service that gets better the more people use it,
  • consuming and remixing data from multiple sources, including individual users, while providing their own data and services in a form that allows remixing by others,
  • creating network effects through an "architecture of participation" and
  • going beyond the page metaphor of Web 1.0 to deliver rich user experiences.", Tim O'Reilly
With the rapid adoption of broadband internet and nations rushing to implement high-speed next-generation networks, content providers are able to leverage Rich Internet Applications and provide Web 2.0 sites. Mobile devices like the iPhone, Android phones, and many more devices to come have in a lot of ways accelerated the adoption of Web 2.0.

The concept of user participation leads to viral marketing, where advertising and marketing can spread faster and wider than ever before. Content can even be created by users to drive the product.

But Web 2.0 is about more than an internet technology revolution. It is a change that has affected human-to-human interactions, both business and personal. And it has become an integral part of people's lives, at least for those living in areas with internet connectivity. Part of the recipe for success in web 2.0 is the inclusion of users as a critical ingredient: they engage and provide key functionality and content.

It is not just economics driving web 2.0; politics is well into this space too, utilising the participatory, netizen-based platform that it has become. Mainstream media like the press and TV news, which used to dominate, are starting to lose their foothold in spreading information. With Web 2.0, an average person can participate, contribute and dominate the web 2.0 world. As evidenced in the 2008 US presidential election, the wide use of Facebook and other social-networking platforms aided canvassing and garnering support. This has certainly changed the way the world functions today, and web 2.0 has certainly nudged the balance of power toward the average person.

The pace of innovation and the introduction of new technologies is ever increasing. We are in an exciting phase of the world's history, so just accept it and call it the Web 2.0 world.

Thursday, March 4, 2010

Augmented Reality

These words are becoming the norm as mobile devices, game consoles and even websites start to use and drive Augmented Reality technology faster and faster. Makers of game consoles like the Wii and of the Apple iPhone, to name a few common augmented reality tools, have been experimenting with and creating applications that enhance sound, sights, smells and effects by creating a new world where reality meets computer-generated layers.

I guess all regular BPL fans will attest to seeing the famous Barclays logo at the center of the football field just before kickoff. In reality, the logo is not painted on the grass; it is computer augmented reality laying an advertisement layer over the football field. The offside line can now easily be replayed and marked with a shade using this video augmentation technique. The applications are wide. In the football case, advertising revenue can be grown and expanded using this technology. The sideline boards can be reconfigured with advertisements pertaining to the locality. TV stations can then use part of the ad revenue to subsidise the match itself and thus charge consumers less. With the Nintendo Wii, we saw how the Wii controllers can emulate devices, creating the impression of actually playing tennis or bowling. Some websites like Wikitude are starting to use location info along with the phone camera to cleverly interpret places of interest.

Basically Augmented Reality is getting to be the norm. In Japan, some movies even have sensory responders that release certain scents during the movie to create a more realistic experience. Universal Studios 4D shows spray water when the screen shows a speedboat racing along, and blow heat when you are adventuring up a volcano.

While we let augmented reality catch up with reality, I just have to mention the scene from the movie Minority Report where augmented reality becomes more real than reality itself. In that future, the line between the two realities is likely no longer there.

Wednesday, March 3, 2010

IT Service Building Block

A service building block comprises software components. These can be grouped into 3 areas.

Presentation <---  Connectors  <----  Services


  • Presentations
    • HTML pages
    • Java servlets
    • JSP
    • AJAX
  • Connectors
    • JMS (Java Message Service)
    • JNDI (Java Naming & Directory Interface)
    • JDBC (Java Database Connectivity)
    • JNI (Java Native Interface)
    • RMI (Remote Method Invocation)
    • JAXP (Java API for XML Processing)
    • JCA (Java EE Connector Architecture)
  • Services
    • Order Entry
    • CRM (Customer Relationship Management)
    • ERP (Enterprise Resource Planning)
    • OLTP
    • OLAP
    • DSS (Decision Support System)
    • HRIS (Human Resource Info Sys)
    • Remote Services
    • Provisioning

Infrastructure Tiers

  • Presentation Tier
    • Web Server
    • Portal Server
    • Cache Server
    • Print Server
    • WAP Server
  • Business/Application Tier
    • Application Server(Weblogic, WAS)
    • Mail Server(SMTP,POP)
    • Fax Server(Hylafax)
    • SMS Server
  • Integration Tier
    • Messaging Server(MQ)
    • Directory Server(LDAP, Exchange)
  • Resource Tier
    • Database Server
    • FTP Server
    • NFS Server
    • Directory Server
    • Report Server 
    • Replication Server
  • Systemic Tier
    • Load Balancing Server
    • Certificate Server(SSL)
    • Security Server (Syslog-ng, IDS - Intrusion Detection System)
    • Monitoring Server(Nagios, CA-Health, Introscope)
    • Time Server(NTP)
    • DNS Server
    • Backup Server

Tuesday, March 2, 2010

AIR

AIR has been around for a few years now and there are more than 100 million AIR apps out there. So what exactly is AIR, given that many modern websites ask visitors to accept AIR apps?

From the wiki definition on 2 March 2010 (extracted from the wiki):
AIR or Adobe Integrated Runtime is a cross-platform runtime environment developed by Adobe Systems for building rich Internet applications using Adobe Flash, Adobe Flex, HTML, or Ajax, that can be deployed as a desktop application.


AIR apps can operate offline; when an Internet connection becomes available again, functions are activated or updated and data can be uploaded.

AIR is a tool that helps web developers build rich internet apps.

Monday, March 1, 2010

RSS

What is RSS?
RSS is a Web content syndication format; the name stands for Really Simple Syndication.

Browsers can read RSS files directly, or you can use an RSS reader or aggregator.

RSS is a format for sharing information in an XML file. The current version is RSS 2.0, whose specification is maintained by the RSS Advisory Board.

Example as follows

1) Create an XML-based file (the RSS feed) to be uploaded to the website




2) Validate the RSS feed at "http://www.feedvalidator.org/".

3) Add a link to the feed in the page to deploy the RSS.
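
For step 1, a feed file can be written by hand or generated. The following is a minimal sketch that builds an RSS 2.0 document with Python's standard xml.etree.ElementTree module. The build_feed function is a hypothetical helper, and the titles and example.org links are made-up sample values; the rss, channel and item elements with title, link and description come from the RSS 2.0 format itself.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_feed(title: str, link: str, description: str,
               items: List[Dict[str, str]]) -> str:
    """Return a minimal RSS 2.0 document as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")

    # title, link and description are the required channel elements.
    for tag, text in (("title", title), ("link", link),
                      ("description", description)):
        ET.SubElement(channel, tag).text = text

    # One <item> per entry in the feed.
    for item in items:
        node = ET.SubElement(channel, "item")
        for tag in ("title", "link", "description"):
            ET.SubElement(node, tag).text = item[tag]

    return ET.tostring(rss, encoding="unicode")


# Sample values: all names and URLs below are assumptions for illustration.
feed_xml = build_feed(
    title="My Tech Notes",
    link="http://example.org/",
    description="Notes on Web 2.0 and cloud computing",
    items=[{
        "title": "RSS",
        "link": "http://example.org/2010/03/rss.html",
        "description": "What RSS is and how to publish a feed",
    }],
)
print(feed_xml)
```

The printed XML is what would be saved as the feed file, validated in step 2 and linked from the page in step 3.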

Folksonomy

Folksonomy:

- is done by the person using the information.
- is about free tagging of information and objects.
- is about tags used for retrieval.
- tagging is done in a shared and open social environment.


Three ELEMENTS in folksonomy
1) Person tagging
2) Object being tagged
3) Name of the tag
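
These three elements map naturally onto a small data structure: each tagging is a (person, object, tag) record, and an index from tag to objects supports retrieval. The sketch below is only an illustration; the Folksonomy class, its method names and the sample people and URLs are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, List, NamedTuple, Set


class Tagging(NamedTuple):
    person: str  # who applied the tag
    obj: str     # what was tagged (here, a URL)
    tag: str     # the free-form tag name


class Folksonomy:
    def __init__(self) -> None:
        self.taggings: List[Tagging] = []
        self._objects_by_tag: DefaultDict[str, Set[str]] = defaultdict(set)

    def add(self, person: str, obj: str, tag: str) -> None:
        """Record that a person tagged an object, using their own words."""
        normalized = tag.strip().lower()
        self.taggings.append(Tagging(person, obj, normalized))
        self._objects_by_tag[normalized].add(obj)

    def find(self, tag: str) -> Set[str]:
        """Retrieve every object that anyone has given this tag."""
        return set(self._objects_by_tag.get(tag.strip().lower(), set()))

    def popular_tags(self, obj: str) -> Counter:
        """Count how many times each tag was applied to an object."""
        return Counter(t.tag for t in self.taggings if t.obj == obj)


folk = Folksonomy()
folk.add("alice", "http://example.org/post/1", "Cloud")
folk.add("bob", "http://example.org/post/1", "cloud")
folk.add("bob", "http://example.org/post/2", "oracle")

print(folk.find("cloud"))
print(folk.popular_tags("http://example.org/post/1"))
```

Because every user's tags land in one shared index, the vocabulary emerges from the people using the information rather than from a predefined taxonomy.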


The Values
- The tagging is derived from people using their own vocabulary and adding explicit meaning.
- Allowing people to participate in tagging in itself creates value for the Internet world, which makes folksonomy an important tool in Web 2.0.

Web 2.0 => Enterprise 2.0

Web 2.0 media tools like Twitter and Facebook, and applications such as wikis, blogs and forums, enable collaboration and knowledge management, which helps to transform a company into an Enterprise 2.0 organisation.

With wikis and blogs, employees can share their experiences and expertise. This benefits the organization in terms of knowledge management, and participative employees usually translate into a motivated and progressive workforce. It also makes knowledge available across multiple locations and supports today's globalized workforce.

Of course, the concerns & challenges of Enterprise 2.0 will then be security, governance and ethics.