Software Engineers Thread.

My project generates data from the users of our services, so we created a graph-based solution.

If you want something open source, I heard this is good: Fast, Scalable Machine Learning Platform | Dato

I did this in Ruby on Rails, along with payments for 3 banks, in just 3 hours :P

Looks promising. Right now we are using OrientDB/Neo4j for the relations, another solution for search, and yet another, a key-value store, for some documents. These documents could be anything. As you can see, scalability and performance are a concern here. If I can somehow tie the documents to the graph in a big-data-style solution, we may get better performance. Let's see how it goes.
 
My B.Tech final year project is on IoT and using IoT protocols such as MQTT and oneM2M. Is anyone familiar with them here?

Looks promising. Right now we are using OrientDB/Neo4j for the relations, another solution for search, and yet another, a key-value store, for some documents. These documents could be anything. As you can see, scalability and performance are a concern here. If I can somehow tie the documents to the graph in a big-data-style solution, we may get better performance. Let's see how it goes.
What kind of processing are you looking at? Apache Spark works great for fast in-memory processing. Its RDDs are blazingly fast with decent fault tolerance. Traditional Apache top-level projects are better for high-density processing, especially batch processing.
I have a research paper under review on algorithmic implementation differences between Apache Spark and MapReduce.
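
For anyone new to the batch model mentioned above, here is a minimal plain-Python sketch of the map, shuffle, and reduce steps using a word count. It only illustrates the programming model; it is not Spark's or Hadoop's actual API, and all function names are made up for this example.

```python
from collections import defaultdict


def map_phase(records):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in records:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group values by key, like the shuffle step between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the grouped values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(records):
    """Run the three phases in order on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(records)))
```

In a real cluster each phase runs in parallel across machines. Roughly speaking, the two systems differ in where the intermediate results live: on disk between stages for classic MapReduce, and mostly in memory for Spark.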

My project generates data from the users of our services, so we created a graph-based solution.

If you want something open source, I heard this is good: Fast, Scalable Machine Learning Platform | Dato

I did this in Ruby on Rails, along with payments for 3 banks, in just 3 hours :P
Looking to hire someone :whistle::whistle:
 
What kind of processing are you looking at? Apache Spark works great for fast in-memory processing. Its RDDs are blazingly fast with decent fault tolerance. Traditional Apache top-level projects are better for high-density processing, especially batch processing.
I have a research paper under review on algorithmic implementation differences between Apache Spark and MapReduce.
It's near-realtime, not exactly realtime. We need to persist large graphs with documents as vertices, where a document may represent text, an image, video, or audio. Depending on the type, there can be different types of indexes. We will probably need our own DSL too, if we take that path. The first question to ask here is: why reinvent the wheel? If existing solutions provide similar capabilities, even if not fully, we might still learn a thing or two from them.
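
To make the "documents as vertices with per-type indexes" idea concrete, here is a small in-memory sketch. Everything in it (the `Vertex` and `DocumentGraph` names, the fields, the type labels) is a hypothetical illustration, not the poster's design. The "index" is just a lookup of vertex ids by document type; a real system would presumably plug in a proper text or media index per type.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Vertex:
    vertex_id: str
    doc_type: str  # assumed labels: "text", "image", "video", "audio"
    payload: dict = field(default_factory=dict)


class DocumentGraph:
    """Toy directed graph whose vertices are typed documents."""

    def __init__(self):
        self.vertices = {}                # vertex_id -> Vertex
        self.edges = defaultdict(set)     # vertex_id -> ids it points to
        self.by_type = defaultdict(set)   # doc_type -> vertex ids (stand-in index)

    def add_vertex(self, vertex):
        self.vertices[vertex.vertex_id] = vertex
        self.by_type[vertex.doc_type].add(vertex.vertex_id)

    def add_edge(self, src, dst):
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must be added before the edge")
        self.edges[src].add(dst)

    def neighbours(self, vertex_id, doc_type=None):
        """Return sorted neighbour ids, optionally filtered by document type."""
        ids = self.edges.get(vertex_id, set())
        if doc_type is not None:
            ids = ids & self.by_type.get(doc_type, set())
        return sorted(ids)
```

Keeping the type index next to the adjacency data is what lets one query combine "related to X" with "of type Y", which is roughly what tying documents to the graph would buy.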
 
What volume of data are we looking at? Depending on the team's technical expertise, you could opt for various solutions. There are a few proprietary APIs (SigStream) that provide solutions, but I guess you are looking for a customized one. You will have to create a custom solution that runs different ML algorithms depending on the type of data present at the vertices.
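
One common way to run a different algorithm per data type is a small registry keyed by type. The sketch below assumes that approach; the handler names and the trivial "processing" they do are placeholders standing in for real ML code, not anything from SigStream or another library.

```python
# Registry mapping a document type to the function that processes it.
HANDLERS = {}


def handler(doc_type):
    """Decorator that registers a processing function for one document type."""
    def register(func):
        HANDLERS[doc_type] = func
        return func
    return register


@handler("text")
def process_text(payload):
    # Placeholder for a text model: just count whitespace-separated tokens.
    return {"tokens": len(payload.split())}


@handler("image")
def process_image(payload):
    # Placeholder for an image model: payload is assumed to be (width, height).
    width, height = payload
    return {"pixels": width * height}


def process_vertex(doc_type, payload):
    """Look up the handler for this vertex's type and run it on the payload."""
    try:
        func = HANDLERS[doc_type]
    except KeyError:
        raise ValueError(f"no handler registered for type {doc_type!r}") from None
    return func(payload)
```

Adding support for video or audio then means registering one more function, without touching the dispatch code.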
 
We need a scalable solution with a few billion vertices, with around a thousand more added each hour on a single node. The team is fine; some of the best guys are working on it. What matters is time.
We have one solution that uses a standard, readily available graph DB, an indexing solution, and a key-value store. But in the longer run, we would like to have a solution of our own. That's why we have spawned this child project.
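
A quick back-of-the-envelope check on those numbers, as a sketch. The thousand-per-hour figure is from the post above; the constants and function names are only for illustration.

```python
HOURS_PER_DAY = 24
SECONDS_PER_HOUR = 3600


def vertices_added(per_hour, days):
    """Total vertices added on one node over the given number of days."""
    return per_hour * HOURS_PER_DAY * days


def writes_per_second(per_hour):
    """Average insert rate implied by an hourly figure."""
    return per_hour / SECONDS_PER_HOUR


per_node_per_year = vertices_added(1000, 365)   # 8,760,000 vertices
average_rate = writes_per_second(1000)          # under one insert per second
```

At about 8.76 million new vertices per node per year, the insert rate itself is small next to a base of a few billion, which suggests the hard part is storing and querying the existing graph rather than keeping up with writes.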
 

Do you know any good trainers for Big Data and/or Informatica?
 
Informatica has a lot of products, I think; I only know about PowerCenter. It's an ETL tool. I have never worked with it, but a close friend of mine does. I will talk to him and get back to you.
As far as ETL is concerned, I would also recommend Talend.

For big data, you can always look into a Hadoop implementation. It's a robust platform with easy integration available, and there are lots of books around. I recommend this one. It will be boring to start with, but it will help you understand how Hadoop works internally.
Hadoop: The Definitive Guide
You can buy it on Kindle.
Cloudera also provides training and certification on Hadoop.
 
If you have part-time or a little experience, that would be okay
I have 1 research internship in IoT from BIT Mesra, 1 six-week internship in Apache server configuration and administration, and summer training in Big Data Hadoop as well as RHCE 6.4.
P.S. Plus a few research papers in the field of IoT are under review in reputed IEEE journals.
 
Guys, please give me project ideas on Big Data; I have to submit one for my final year project. Your help would be appreciated!
 
