Thursday, January 30, 2014

Learning Nutch Code

Nutch extension points
parse-plugins.xml represents a natural ordering for which parsing plugin should get called for a particular mimeType.
Parser and ParseFilter are always used together; they are called in the Nutch parse task.

Parser implementations: HtmlParser, TikaParser. Only one parser is run for a given page.

In Parser, the WebPage param is filled with values from the Fetcher, such as baseUrl, metadata, contentType, headers.
Its content is a ByteBuffer holding the whole HTML page. At this point, title, text and other values are not set; they are null.

The parser parses the content and extracts the plain text (HTML tags excluded), the title, and the outlinks.
byte[] contentInOctets = page.getContent().array();
InputSource input = new InputSource(new ByteArrayInputStream(
          contentInOctets));
utils.getText(sb, root);
text = sb.toString();
Parse parse = new Parse(text, title, outlinks, status);

public interface Parser extends FieldPluggable, Configurable {
  Parse getParse(String url, WebPage page);
}

ParseFilter permits one to add additional metadata to parses provided by the html or tika plugins. All plugins found which implement this extension point are run sequentially on the parse.
org.apache.nutch.parse.ParseFilter
Parse filter(String url, WebPage page, Parse parse,
                    HTMLMetaTags metaTags, DocumentFragment doc);

IndexingFilter: to add new field, example: index-basic, index-anchor
IndexingFilter is called in IndexerJob during the index task. IndexerJob creates an IndexerMapper and has no reduce tasks.
At this point, baseUrl, content, contentType, outlinks are not set.
text, title are set.

public Collection<WebPage.Field> getFields() {
return FIELDS;
}
FIELDS.add(WebPage.Field.TITLE);
FieldPluggable defines the getFields() method.
BasicIndexingFilter
doc.add("content", TableUtil.toString(page.getText()));

getFields() tells Nutch's GoraMapper which fields to load. We can add a field to getFields() to tell Nutch to load it from its data store; Nutch returns data for that field as long as it has already been filled.
IndexerJob collects all fields from all IndexingFilters.
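
The field collection described above can be sketched in a few lines. This is a simplified Python model, not Nutch code: the class names and field names below are hypothetical stand-ins for plugins that implement FieldPluggable, and the job simply unions whatever each plugin asks for.

```python
# Simplified model of how a job unions the fields declared by its plugins.
# All class and field names here are illustrative, not Nutch APIs.

class TitleFilter:
    def get_fields(self):
        return {"TITLE", "TEXT"}


class MetadataFilter:
    def get_fields(self):
        return {"METADATA"}


def collect_fields(job_fields, filters):
    """Union the job's own fields with those requested by each filter."""
    columns = set(job_fields)
    for f in filters:
        columns |= f.get_fields()
    return columns


fields = collect_fields({"STATUS"}, [TitleFilter(), MetadataFilter()])
print(sorted(fields))
```

Only the fields in the resulting set would be passed to the query, so a filter that forgets to declare a field would see it as unset.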

Each job (such as FetcherJob, ParserJob) collects the fields it needs:
ParserJob collects all fields from ParseFilter.
FetcherJob collects all fields from ParserJob and ProtocolFactory.
  private static Collection<WebPage.Field> getFields(Job job) {
    Configuration conf = job.getConfiguration();
    Collection<WebPage.Field> columns = new HashSet<WebPage.Field>(FIELDS);
    IndexingFilters filters = new IndexingFilters(conf);
    columns.addAll(filters.getFields());
    ScoringFilters scoringFilters = new ScoringFilters(conf);
    columns.addAll(scoringFilters.getFields());
    return columns;
  }

org.apache.nutch.storage.StorageUtils.initMapperJob(Job, Collection<Field>, Class<K>, Class<V>, Class<? extends GoraMapper<String, WebPage, K, V>>, Class<? extends Partitioner<K, V>>, boolean)
DataStore<String, WebPage> store = createWebStore(job.getConfiguration(),
String.class, WebPage.class);
Query<String, WebPage> query = store.newQuery();
query.setFields(toStringArray(fields));
GoraMapper.initMapperJob(job, query, store,
outKeyClass, outValueClass, mapperClass, partitionerClass, reuseObjects);
GoraOutputFormat.setOutput(job, store, true);

org.apache.nutch.indexer.IndexerJob.createIndexJob(Configuration, String, String)
Collection<WebPage.Field> fields = getFields(job);
StorageUtils.initMapperJob(job, fields, String.class, NutchDocument.class,IndexerMapper.class);
job.setNumReduceTasks(0);
job.setOutputFormatClass(IndexerOutputFormat.class);
SolrIndexerJob, ElasticIndexerJob
public class SolrIndexerJob extends IndexerJob {}

In the Solr IndexerOutputFormat, getRecordWriter creates the NutchIndexWriters, opens them, and returns a RecordWriter that is responsible for writing docs to Solr.
The writers are created at: final NutchIndexWriter[] writers =
      NutchIndexWriterFactory.getNutchIndexWriters(job.getConfiguration());
public void write(String key, NutchDocument doc) throws IOException {
for (final NutchIndexWriter writer : writers) {
 writer.write(doc);
}
}
NutchIndexWriter has 2 implementations: SolrWriter and ElasticWriter.
In SolrWriter, open() creates an HttpSolrServer and write() sends the doc to Solr.
public abstract class OutputFormat<K, V>
{
  public abstract RecordWriter<K, V> getRecordWriter(TaskAttemptContext paramTaskAttemptContext)
    throws IOException, InterruptedException;

  public abstract void checkOutputSpecs(JobContext paramJobContext)
    throws IOException, InterruptedException;

  public abstract OutputCommitter getOutputCommitter(TaskAttemptContext paramTaskAttemptContext)
    throws IOException, InterruptedException;
}

Permits one to add metadata to the indexed fields.  All plugins found which implement this extension point are run
sequentially on the parse.
page.getText() will return 
public interface IndexingFilter extends FieldPluggable, Configurable {
NutchDocument filter(NutchDocument doc, String url, WebPage page)  throws IndexingException;
}
public interface FieldPluggable extends Pluggable {
  public Collection<WebPage.Field> getFields();
}

Gotcha:
no inlinks in nutch.

Wednesday, January 29, 2014

Learning Python

Using ActivePython, as it bundles more modules/packages by default.

Check that the installed Python and the Python package have the same bitness (both 32-bit or both 64-bit).
Add the Python home and the *python/scripts* folder to the PATH environment variable: C:\Python27 and C:\Python27\Scripts
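
One way to do the 32/64-bit check mentioned above is to ask the interpreter itself. A minimal sketch using only the standard library; the function name python_bits is my own.

```python
import platform
import struct


def python_bits():
    """Return 32 or 64 depending on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


# platform.architecture() reports the same information as a string such as '64bit'.
print(python_bits(), platform.architecture()[0])
```

A binary package has to match this value, not the bitness of Windows itself.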

Use an exe installer to install a package if possible. Installing from source is hard because its dependencies have to be installed manually.
Use pip or easy_install to install packages:
easy_install Scrapy
python setup.py install

Python 2.7 needs Visual Studio Express 2008; use the 32-bit version.
Install 32-bit Python.

The Windows binaries of the latest version of lxml (as well as a wide range of other Python packages) are available on http://www.lfd.uci.edu/~gohlke/pythonlibs/
Libxml-python are bindings for the libxml2 and libxslt libraries.

Tuesday, January 28, 2014

Learning XPath

http://www.w3schools.com/xpath/xpath_syntax.asp
/ Selects from the root node
// Selects nodes in the document from the current node that match the selection no matter where they are
. Selects the current node
.. Selects the parent of the current node
@ Selects attributes

bookstore/book Selects all book elements that are children of bookstore
//book Selects all book elements no matter where they are in the document
bookstore//book Selects all book elements that are descendant of the bookstore element, no matter where they are under the bookstore element
//@lang Selects all attributes that are named lang

Predicates
Predicates are used to find a specific node or a node that contains a specific value.
Predicates are always embedded in square brackets.
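/bookstore/book[1] Selects the first book element that is a child of the bookstore element
/bookstore/book[last()] Selects the last book element that is a child of the bookstore element
/bookstore/book[last()-1] Selects the last but one book element that is a child of the bookstore element
//title[@lang] Selects all the title elements that have an attribute named lang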
//title[@lang='eng'] Selects all the title elements that have an attribute named lang with a value of 'eng'
/bookstore/book[price>35.00] Selects all the book elements of the bookstore element that have a price element with a value greater than 35.00
/bookstore/book[price>35.00]/title Selects all the title elements of the book elements of the bookstore element that have a price element with a value greater than 35.00
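
To try these predicates out, here is a small sketch with Python's xml.etree.ElementTree. The sample bookstore document is made up for illustration. ElementTree only supports a subset of XPath: positional and attribute predicates work, but numeric comparisons such as price>35.00 do not, so that one is filtered in plain Python.

```python
import xml.etree.ElementTree as ET

# Made-up sample data in the shape the expressions above assume.
XML = """
<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title lang="eng">Learning XML</title><price>39.95</price></book>
  <book><title>Untitled Draft</title><price>45.00</price></book>
</bookstore>
"""

root = ET.fromstring(XML)  # root is the <bookstore> element


def titles(books):
    return [b.findtext("title") for b in books]


first = titles(root.findall("./book[1]"))
last = titles(root.findall("./book[last()]"))
second_last = titles(root.findall("./book[last()-1]"))
with_lang = [t.text for t in root.findall(".//title[@lang]")]
eng = [t.text for t in root.findall(".//title[@lang='eng']")]

# ElementTree has no numeric comparison in predicates, so filter in Python.
expensive = [b.findtext("title") for b in root.findall("./book")
             if float(b.findtext("price")) > 35.00]

print(first, last, second_last, with_lang, eng, expensive)
```

Paths are written relative to the root element here (./book instead of /bookstore/book) because ElementTree evaluates expressions from the element they are called on.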

Selecting Unknown Nodes
* Matches any element node
@* Matches any attribute node
node() Matches any node of any kind
/bookstore/* Selects all the child element nodes of the bookstore element
//* Selects all elements in the document
//title[@*] Selects all title elements which have any attribute

Selecting Several Paths
By using the | operator in an XPath expression you can select several paths.
//book/title | //book/price Selects all the title AND price elements of all book elements
//title | //price Selects all the title AND price elements in the document
/bookstore/book/title | //price Selects all the title elements of the book element of the bookstore element AND all the price elements in the document

Monday, January 27, 2014

Android Projects Ideas

Learning Google Apps Script

https://developers.google.com/apps-script/

Custom menus in Google Docs, Sheets, or Forms
https://developers.google.com/apps-script/guides/menus
function onOpen() {
  var ui = DocumentApp.getUi();
  // Or FormApp or SpreadsheetApp.
  ui.createMenu('Custom Menu')
      .addItem('First item', 'menuItem1')
      .addSeparator()
      .addSubMenu(ui.createMenu('Sub-menu')
          .addItem('Second item', 'menuItem2'))
      .addToUi();
}

function menuItem1() {
  DocumentApp.getUi() // Or FormApp or SpreadsheetApp.
     .alert('You clicked the first menu item!');
}

https://developers.google.com/apps-script/quickstart/macros

Friday, January 24, 2014

Learning Solr API

Parse params in Solr config file
Made same error again, time to write it down:
<requestHandler ...>
<lst name="defaults">
<str name="dataFolder">data-folder</str>
</lst>
</requestHandler>
This code doesn't work; dataFolder would be null, because the data in params is: defaults={dataFolder=data-folder}
public void init(NamedList args) {
SolrParams params = SolrParams.toSolrParams(args);
dataFolder = params.get(PARAM_DATA_FOLDER);
}

Should use:
super.init(args);
if (defaults != null) {
dataFolder = defaults.get(PARAM_DATA_FOLDER);
}
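
Why the first version returns null is easier to see with the nesting written out. This is a plain Python dict standing in for the NamedList passed to init(); it is only a model of the data shape, not Solr code, and the helper names are made up.

```python
# Hypothetical stand-in for the args that reach init(): the handler's
# <lst name="defaults"> arrives as a nested entry, not as top-level parameters.
args = {"defaults": {"dataFolder": "data-folder"}}


def read_top_level(args, name):
    """What the broken version does: look the parameter up at the top level."""
    return args.get(name)


def read_from_defaults(args, name):
    """What the fixed version does: look inside the nested defaults list."""
    defaults = args.get("defaults")
    return defaults.get(name) if defaults is not None else None


print(read_top_level(args, "dataFolder"))      # None
print(read_from_defaults(args, "dataFolder"))  # data-folder
```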

Be sure to close SolrQueryRequest
org.apache.solr.request.SolrQueryRequestBase
The close() method must be called on any instance of this class once it is no longer in use. 

org.apache.solr.core.SolrResourceLoader.normalizeDir(String)
org.apache.solr.core.CoreDescriptor.getDataDir()
if (new File(instanceDir).isAbsolute()) {
        result= SolrResourceLoader.normalizeDir(SolrResourceLoader.normalizeDir(instanceDir) + dataDir);
} else  {
result= SolrResourceLoader.normalizeDir(coreContainer.getSolrHome() +
SolrResourceLoader.normalizeDir(instanceDir) + dataDir);
}

Using QParser to parse query string
QueryResponseWriter responseWriter = core.getQueryResponseWriter(solrReq);
QParser qparser = QParser.getParser(tmpQuery, "lucene", req);
SolrQueryParser parser = new SolrQueryParser(qparser, req
.getSchema().getDefaultSearchFieldName());
Query query = parser.parse(tmpQuery);
DocSet docSet = searcher.getDocSet(query);
DocSet interSection = docs.intersection(docSet);

Run Query in Solr Server
final SolrQueryRequest newReq = new LocalSolrQueryRequest(core, newParams);
final SolrQueryResponse newResp = new SolrQueryResponse();
SolrRequestHandler handler = core.getRequestHandler("/select");
core.execute(handler, newReq, newResp);

SolrIndexSearcher searcher = newReq.getSearcher();
NamedList<Object> valuesNL = newResp.getValues();
Object rspObj = (Object) valuesNL.get("response");
if (rspObj instanceof ResultContext) {
 ResultContext rc = (ResultContext) rspObj;
 DocList docList = rc.docs;
 DocIterator dit = docList.iterator();
 while (dit.hasNext()) {
  int docid = dit.nextDoc();
  Document doc = searcher.doc(docid, (Set<String>) null);
  //
 }
} else if (rspObj instanceof SolrDocumentList) {
 SolrDocumentList docList = (SolrDocumentList) rspObj;
 long size = docList.size();
 Iterator<SolrDocument> docIt = docList.iterator();
 while (docIt.hasNext()) {
  SolrDocument doc = docIt.next();
  docIt.remove();
  //
 }
} else {
 throw new RuntimeException("Unknown response type: "
   + ((rspObj != null) ? rspObj.getClass() : "null"));
}

How Solr returnFields are defined
ReturnFields returnFields = new SolrReturnFields( req );
rsp.setReturnFields( returnFields );

Using getStatistics to add stats info
public NamedList<Object> getStatistics() {}
private final AtomicLong numTimeouts = new AtomicLong();
private final Timer requestTimes = new Timer();
numErrors.incrementAndGet();

Parse Queries in SolrConfig: org.apache.solr.core.QuerySenderListener.newSearcher(SolrIndexSearcher, SolrIndexSearcher)

List<NamedList> allLists = (List<NamedList>) args.get("queries");
if (allLists == null) return;
for (NamedList nlst : allLists) {
  // Declare req outside the try block so that finally can close it.
  SolrQueryRequest req = null;
  try {
    NamedList result = new NamedList();
    // params is built from nlst (construction elided in these notes).
    if (params.get("distrib") == null) {
      params.add("distrib", false);
    }
    req = new LocalSolrQueryRequest(core, params);
    SolrQueryResponse rsp = new SolrQueryResponse();
    SolrRequestInfo.setRequestInfo(new SolrRequestInfo(req, rsp));
    core.execute(core.getRequestHandler(req.getParams().get(CommonParams.QT)), req, rsp);
  } finally {
    if (req != null) req.close();
    SolrRequestInfo.clearRequestInfo();
  }
}

Using SolrPluginUtils.docListToSolrDocumentList to convert Lucene DocList to SolrDocumentList

SolrPluginUtils.docListToSolrDocumentList(docList,req.getSearcher(), fieldSet, null)

LocalParams
SolrParams localParams = QueryParsing.getLocalParams(param, req.getParams());
key = localParams.get(CommonParams.OUTPUT_KEY, key);
String excludeStr = localParams.get(CommonParams.EXCLUDE);

Map<?,?> tagMap = (Map<?,?>)req.getContext().get("tags");

How to Build Lucene/Solr

ant compile
ant test

ant eclipse

ant generate-maven-artifacts

Use maven to build:
svn update
ant get-maven-poms
cd maven-build

mvn -DskipTests install
mvn -Dtest=TestClassName test
mvn -DskipTests source:jar-no-fork install
ant clean-maven-build

Wednesday, January 22, 2014

Android Studio/IntelliJ IDEA Keyboard Shortcuts

I am used to the Eclipse key bindings, which seem more intuitive: Eclipse uses Ctrl+Shift+T to open a class (type) and Ctrl+Shift+R to open a file (resource). IntelliJ uses Ctrl+N (which opens the New wizard in Eclipse) and Ctrl+Shift+N, which is kind of counter-intuitive.
But I have to get used to it.
Ctrl+Shift+A: shortcut for looking up shortcut information.
Ctrl + N Go to class
Ctrl + Shift + N Go to file
Ctrl+F12, quick outline/File structure popup
Ctrl+W, select block (widens on subsequent presses)
Ctrl+D, duplicates current line (or selection, including methods, etc.)
CTRL+], move caret to code block end (and CTRL-[)
SHIFT+RETURN, open new line below current and move cursor

Ctrl + H Type hierarchy
Ctrl + Shift + H Method hierarchy
Ctrl + Alt + H Call hierarchy

Ctrl + B or Ctrl + Click Go to declaration
Ctrl + Alt + B Go to implementation(s)
Ctrl + Shift + I Open quick definition lookup
Ctrl + Shift + B Go to type declaration
Ctrl + U Go to super-method/super-class

Tuesday, January 21, 2014

Learning Solr Join Concept


http://wiki.apache.org/solr/Join
q={!join from=manu_id_s to=id}ipod
You specify the foreign key relationship by giving the from and to fields to join on.
http://localhost:8983/solr/select?q=ipod&fl=*,score&sort=score+desc&fq={!join+from=id+to=manu_id_s}compName_s:%28Belkin%20Apple%29
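
What the two join queries above do can be modelled in plain Python. This is only a sketch of the semantics over an in-memory list; the sample documents are made up, and the real implementation works on the index, not on dicts.

```python
def join(docs, matches, from_field, to_field):
    """Collect from_field values of matching docs, then return the docs
    whose to_field holds one of those values."""
    keys = {d[from_field] for d in docs if from_field in d and matches(d)}
    return [d for d in docs if d.get(to_field) in keys]


# Made-up sample data: two manufacturers and three products.
docs = [
    {"id": "apple", "compName_s": "Apple"},
    {"id": "belkin", "compName_s": "Belkin"},
    {"id": "ipod1", "name": "ipod video", "manu_id_s": "apple"},
    {"id": "cable1", "name": "ipod cable", "manu_id_s": "belkin"},
    {"id": "disk1", "name": "hard disk", "manu_id_s": "maxtor"},
]

# {!join from=manu_id_s to=id}ipod : manufacturers of products matching "ipod"
makers = join(docs, lambda d: "ipod" in d.get("name", ""), "manu_id_s", "id")

# {!join from=id to=manu_id_s}compName_s:(Belkin Apple) : products of those companies
products = join(docs, lambda d: d.get("compName_s") in ("Belkin", "Apple"),
                "id", "manu_id_s")

print([d["id"] for d in makers], [d["id"] for d in products])
```

Note that the result contains only documents from the "to" side; nothing from the "from" side is carried over.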



http://blog.griddynamics.com/2013/09/solr-block-join-support.html
SolrInputDocument has new methods getChildDocuments()/addChildDocument() for nesting child documents into a parent document. The XML and Javabin formats can now transfer them.
q={!parent which='type_s:parent'}+COLOR_s:Red +SIZE_s:XL 
Local parameter ‘which’ provides a filter that distinguishes parent documents from child documents. Keep in mind two important things about it:
it should not match any child documents;
it should always match all parent documents.

This {!parent} query can be combined with any other query and filter.
q=+BRAND_s:Nike +_query_:"{!parent which=type_s:parent}+COLOR_s:Red +SIZE_s:XL"
The same can be achieved by employing a filter query:
q={!parent which=type_s:parent}+COLOR_s:Red +SIZE_s:XL&fq=BRAND_s:Puma

Don’t try to constrain children with filter queries; it doesn’t work, because filter queries constrain the {!parent} query.
{!child of=type_s:parent}BRAND_s:Puma returns the SKUs belonging to the single Puma product.


Add block support for JSONLoader
https://issues.apache.org/jira/browse/SOLR-5183

Proposal for nested document support in Lucene
http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene

Other Parsers
https://cwiki.apache.org/confluence/display/solr/Other+Parsers
Block Join Query Parsers
In terms of performance, indexing the relationships between documents may be more efficient than attempting to do joins only at query time, since the relationships are already stored in the index and do not need to be computed.
To use these parsers, documents must be indexed as child documents. Currently documents can only be indexed with the relational structure with the XML update handler. The XML structure allows <doc> elements inside <doc> elements.
You must also include a field that identifies the parent document as a parent; it can be any field that suits this purpose.
<add>
  <doc>
  <field name="id">1</field>
  <field name="title">Solr adds block join support</field>
  <field name="content_type">parentDocument</field>
    <doc>
     <field name="id">2</field>  
     <field name="comments">SolrCloud supports it too!</field>
    </doc>
  </doc>
  <doc>
    <field name="id">3</field>
    <field name="title">Lucene and Solr 4.5 is out</field>
    <field name="content_type">parentDocument</field>
    <doc>
     <field name="id">4</field>
     <field name="comments">Lots of new features</field>
    </doc>
  </doc>
</add>
Block Join Children Query Parser
This parser takes a query that matches some parent documents and returns their children. The syntax for this parser is: q={!child of=<allParents>}<someParents>. The parameter allParents is a filter that matches only parent documents; here you would define the field and value that you used to identify a document as a parent. The parameter someParents identifies a query that will match some or all of the parent documents. The output is the children.
Using the example documents above, we can construct a query such as q={!child of="content_type:parentDocument"}title:lucene
Block Join Parent Query Parser
This parser takes a query that matches child documents and returns their parents. The syntax for this parser is similar: q={!parent which=<allParents>}<someChildren>. Again, the parameter allParents is a filter that matches only parent documents; here you would define the field and value that you used to identify a document as a parent. The parameter someChildren is a query that matches some or all of the child documents. Note that the query for someChildren should match only child documents or you may get an exception.
Again using the example documents above, we can construct a query such as q={!parent which="content_type:parentDocument"}comments:SolrCloud. 
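
The two block join parsers can be modelled on the example documents above. This is only a sketch of the semantics with nested Python dicts; the "children" key is my own way of representing the nesting, not how Solr stores blocks.

```python
# The example documents from above, with child docs nested under their parent.
blocks = [
    {"id": "1", "title": "Solr adds block join support",
     "content_type": "parentDocument",
     "children": [{"id": "2", "comments": "SolrCloud supports it too!"}]},
    {"id": "3", "title": "Lucene and Solr 4.5 is out",
     "content_type": "parentDocument",
     "children": [{"id": "4", "comments": "Lots of new features"}]},
]


def parent_query(blocks, child_matches):
    """Like {!parent which=...}: parents with at least one matching child."""
    return [p["id"] for p in blocks
            if any(child_matches(c) for c in p["children"])]


def child_query(blocks, parent_matches):
    """Like {!child of=...}: all children of the matching parents."""
    return [c["id"] for p in blocks if parent_matches(p) for c in p["children"]]


# {!parent which="content_type:parentDocument"}comments:SolrCloud
print(parent_query(blocks, lambda c: "solrcloud" in c["comments"].lower()))
# {!child of="content_type:parentDocument"}title:lucene
print(child_query(blocks, lambda p: "lucene" in p["title"].lower()))
```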

Collapsing Query Parser
fq={!collapse field=<field_name>}

http://www.slideshare.net/lucenerevolution/grouping-and-joining-in-lucenesolr

Join Contrib
https://issues.apache.org/jira/browse/SOLR-4787

Solr(Cloud) should support block joins
https://issues.apache.org/jira/browse/SOLR-3076

Block Indexing / Join Improvements
https://issues.apache.org/jira/browse/SOLR-5142

Add scoring support for query time join
https://issues.apache.org/jira/browse/LUCENE-4043
Solr uses a different joining implementation, which doesn't support mapping the scores from the `from` side to the `to` side. If you want to use the Lucene joining implementation you could wrap this in a Solr QParserPlugin extension.

Wednesday, January 15, 2014

Stemming and Lemmatisation

Lemmatisation is the process of grouping together the different inflected forms of a word so they can be analysed as a single item. It's the algorithmic process of determining the lemma for a given word.

The base form of a word, the one that one might look up in a dictionary, is called the lemma (headword) for the word. The combination of the base form with the part of speech is often called the lexeme (lexical unit) of the word.

A stemmer operates on a single word without knowledge of the context, and therefore cannot discriminate between words which have different meanings depending on part of speech. However, stemmers are typically easier to implement and run faster, and the reduced accuracy may not matter for some applications.
The purpose of stemming is not to produce the appropriate lemma; that is a more challenging task that requires knowledge of context. The main purpose of stemming is to map different forms of a word to a single form.

Lemmatization is closely related to stemming but unlike stemming, which operates only on a single word at a time, lemmatization operates on the full text and therefore can discriminate between words that have different meanings depending on part of speech.
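
The difference can be shown with a toy example. Both functions below are deliberately crude stand-ins written for these notes (a three-suffix stripper and a hand-written lookup table); they are not how a real stemmer or lemmatizer is implemented.

```python
def toy_stem(word):
    """Strip a few common suffixes, looking only at the word itself."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Tiny hand-written lemma table keyed by (word, part of speech).
LEMMAS = {
    ("meeting", "NOUN"): "meeting",
    ("meeting", "VERB"): "meet",
    ("better", "ADJ"): "good",
}


def toy_lemmatize(word, pos):
    """Look the word up together with its part of speech; fall back to the word."""
    return LEMMAS.get((word, pos), word)


print(toy_stem("meeting"))               # same answer whatever the context
print(toy_lemmatize("meeting", "NOUN"))  # depends on the part of speech
print(toy_lemmatize("meeting", "VERB"))
print(toy_stem("better"), toy_lemmatize("better", "ADJ"))
```

The stemmer maps "meeting" to "meet" regardless of usage and cannot relate "better" to "good", while the lemmatizer needs the part of speech as extra input.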

Stanford CoreNLP
http://nlp.stanford.edu/software/corenlp.shtml

Lemmatizer
https://wiki.searchtechnologies.com/index.php/Lemmatizer
LemmaGen
http://lemmatise.ijs.si/

KeywordRepeatFilter and RemoveDuplicatesTokenFilterFactory
In Solr 4.3, the KeywordRepeatFilterFactory has been added to assist this functionality. This filter emits two tokens for each input token, one of them is marked with the Keyword attribute. Stemmers that respect keyword attributes will pass through the token so marked without change. So the effect of this filter would be to index both the original word and the stemmed version. The 4 stemmers listed above all respect the keyword attribute.

For terms that are not changed by stemming, this will result in duplicate, identical tokens in the document. This can be alleviated by adding the RemoveDuplicatesTokenFilterFactory.
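
The effect of chaining the two filters around a stemmer can be sketched as follows. This is a toy model of the token stream as (position, text, keyword flag) tuples; the function names and the crude_stem helper are made up for illustration and are not the Lucene classes.

```python
def keyword_repeat(tokens):
    """Emit each token twice at the same position: a keyword copy and a normal copy."""
    out = []
    for pos, tok in enumerate(tokens):
        out.append((pos, tok, True))   # keyword=True: stemmers must leave it alone
        out.append((pos, tok, False))  # keyword=False: stemmers may change it
    return out


def stem_filter(stream, stem):
    """Apply the stemmer only to tokens that are not marked as keywords."""
    return [(pos, tok if kw else stem(tok), kw) for pos, tok, kw in stream]


def remove_duplicates(stream):
    """Drop a token if the same text was already seen at the same position."""
    seen, out = set(), []
    for pos, tok, _kw in stream:
        if (pos, tok) not in seen:
            seen.add((pos, tok))
            out.append((pos, tok))
    return out


def crude_stem(tok):  # hypothetical stand-in for a real stemmer
    return tok[:-3] if tok.endswith("ing") else tok


result = remove_duplicates(
    stem_filter(keyword_repeat(["jumping", "fast"]), crude_stem))
print(result)
```

"jumping" ends up indexed both as-is and stemmed, while "fast", which the stemmer does not change, is kept only once.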

Learning Solr CJK

http://discovery-grindstone.blogspot.com/2014/01/cjk-with-solr-for-libraries-part-7.html
Equate Traditional Characters With Simplified Characters

The CJKBigramFilter, new with Solr 3.6, allows us to generate both the unigrams and the bigrams for CJK scripts only.
The CJKBigramFilter must be fed appropriate values for the token type.
a. ICUTokenizer 
b. StandardTokenizer 
Note that the StandardTokenizer output does not separate the Hangul character from the Latin characters at the end. 
c. ClassicTokenizer 
I believe ClassicTokenizer will also assign token types, probably the same way StandardTokenizer does.
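
What "both the unigrams and the bigrams" means for a run of CJK characters, as a minimal sketch. It assumes the input is already a single run of Han characters; the real filter relies on the token types assigned by the tokenizer to decide which tokens to combine, which this toy function does not model.

```python
def cjk_unigrams_and_bigrams(text):
    """Return every single character plus every pair of adjacent characters."""
    unigrams = list(text)
    bigrams = [text[i:i + 2] for i in range(len(text) - 1)]
    return unigrams, bigrams


unigrams, bigrams = cjk_unigrams_and_bigrams("中華人民")
print(unigrams)
print(bigrams)
```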

2.  ICU Script Translations
Solr makes the following script translations available via the solr.ICUTransformFilterFactory:
Han Traditional <--> Simplified
Katakana <--> Hiragana

3.  ICU Folding Filter
We have already been using solr.ICUFoldingFilterFactory for case folding, e.g. normalizing "A" to "a".
It's an efficient single pass through the string. For practical purposes this means you can use this factory as a better substitute for the combined behavior of ASCIIFoldingFilter, LowerCaseFilter, and ICUNormalizer2Filter.

Solr Fieldtype Definition
positionIncrementGap attribute on fieldType
This setting is all about trying to keep your matches within a single field value for a multivalued field.
The values in a multivalued field are stored adjacently, and this setting is the number of pretend tokens between the field values. Thus, a large value keeps phrase queries from matching some words at the end of one field value and some words at the beginning of the following field value.
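
A small sketch of how the gap affects token positions and phrase matching. It is a simplified model with whitespace tokenization and made-up field values; the function names are my own.

```python
def positions(values, gap):
    """Assign a token position to every word of a multivalued field."""
    out, pos = [], 0
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # the pretend tokens between two field values
        for word in value.split():
            out.append((word, pos))
            pos += 1
    return out


def phrase_matches(values, gap, first, second):
    """True if `second` directly follows `first` (exact phrase, no slop)."""
    toks = positions(values, gap)
    return any(w1 == first and w2 == second and p2 - p1 == 1
               for w1, p1 in toks for w2, p2 in toks)


values = ["red apple", "pie crust"]
print(positions(values, 0))
print(positions(values, 100))
print(phrase_matches(values, 0, "apple", "pie"))    # matches across values
print(phrase_matches(values, 100, "apple", "pie"))  # the gap prevents the match
```

With a gap of 0 the phrase "apple pie" matches even though the two words come from different values; with a gap of 100 only phrases inside a single value match.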

autoGeneratePhraseQueries attribute on fieldType
As LUCENE-2458 describes, prior to Solr 3.1, if more than one token was created for whitespace delimited text, then a phrase query was automatically generated.  While this behavior is generally desired for European languages, it is not desired for CJK.  

ICUTokenizerFactory
The section above on CJKBigram analysis shows that the ICUTokenizer is likely to be better than the StandardTokenizer for tokenizing CJK characters into typed unigrams, as needed by the CJKBigramFilter.
CJKWidthFilterFactory
It may be that this is completely unnecessary, but on the off chance that the script translations don't accommodate half-width characters, I go ahead and normalize them here.  


CJK with Solr for Libraries, part 8
http://discovery-grindstone.blogspot.com/2014/01/cjk-with-solr-for-libraries-part-8.html
I coded the above using rspec-solr (http://rubydoc.info/github/sul-dlss/rspec-solr) 
http://explain.solr.pl/
mm setting
Our mm setting was  6<-1 6<90%, which translates to:  for 1 to 6 clauses, all are required; for more than 6 clauses, 90% (rounded down) are required. 

I chose this mm setting for CJK: 
3<86%
Per the mm spec, this says for three or fewer "clauses" (tokens), all are required, but for four or more tokens, only 86% (rounded down) are required.  This is perfect for CJK queries of 6 or fewer characters, a tad high for 7 characters, and perfect again for 8 characters, and seems to be the best fit available.
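
The arithmetic behind 3<86% can be checked quickly. A minimal sketch that only handles a single conditional spec of the form threshold<percent%; it is my own helper, not Solr's parser, and it ignores the other mm syntaxes.

```python
def min_should_match(num_clauses, threshold, percent):
    """Required clause count for a single conditional spec 'threshold<percent%'."""
    if num_clauses <= threshold:
        return num_clauses               # at or below the threshold: all required
    return num_clauses * percent // 100  # above it: the percentage, rounded down


for n in range(1, 9):
    print(n, min_should_match(n, 3, 86))
```

For 4 to 8 tokens this requires 3, 4, 5, 6 and 6 matches, i.e. one token may be missing up to 7 tokens and two at 8.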

http://discovery-grindstone.blogspot.com/2014/01/cjk-with-solr-for-libraries-part-9.html
qs setting for phrase searches
Dismax and edismax have a "query phrase slop" parameter, qs, which is the distance allowed between tokens when the query has explicitly indicated a phrase search with quotation marks. Probably from back in our stopword days, we use a setting of qs=1, meaning a query of "women's literature", with the quotes, is allowed to match results containing 'women and literature' as well as 'women in literature' in addition to 'women's literature'. Because of the magic of pf sorting the best matches first, this has worked just fine for our users up until now. However, with CJK queries, this is undesirable: an explicit phrase query in CJK should only match the exact characters entered, with nothing inserted between them, so qs=0.

catch-all field

Halfwidth and fullwidth forms

http://en.wikipedia.org/wiki/Halfwidth_and_fullwidth_forms

Tuesday, January 14, 2014

Learning Gradle

With Gradle, we can automate the compiling,
testing, packaging, and deployment of our software or other types of projects. Gradle
is flexible but has sensible defaults for most projects. This means we can rely on the
defaults, if we don't want something special, but can still use the flexibility to adapt
a build to certain custom needs.

Declarative builds and convention over configuration
Gradle uses a Domain Specific Language (DSL) based on Groovy to declare builds.
The DSL provides a flexible language that can be extended by us.

Support for Ant tasks and Maven repositories
Gradle wrapper
The Gradle wrapper allows us to execute Gradle builds, even though Gradle is not
installed on a computer.
Gradle
bundles the Groovy libraries with the distribution and will ignore a Groovy
installation already available on our computer.

A directory named init.d where we can store Gradle scripts that
need to be executed each time we run Gradle

gradle -v
If we want to add JVM options to Gradle, we can use the environment variables
JAVA_OPTS and GRADLE_OPTS.

build.gradle
task helloWorld << {
println 'Hello world.'
}
The << syntax is operator shorthand for the method leftShift(); here it means we
want to add the closure to our task with the name helloWorld.

gradle helloWorld
--quiet (or -q)
gradle --quiet helloWorld

Default Gradle tasks: gradle -q tasks

gradle -q help
gradle -q properties
The properties task is very useful to see the properties available to our project. 
gradle -q dependencies
gradle -q projects

Task name abbreviation
gradle -q hello
gradle -q hW

Executing multiple tasks: gradle helloWorld tasks

Command-line options
gradle --help
--build-file (or -b) and --project-dir (or -p).
gradle --project-dir hello-world -q helloWorld

Running tasks without execution
With the option --dry-run (or -m), we can run all the tasks without really executing
them.

Gradle daemon
We can reduce the build
execution time if we don't have to load a JVM, Gradle classes, and libraries, each time
we execute a build. The command-line option, --daemon, starts a new Java process
that will have all Gradle classes and libraries already loaded, and then we execute
the build. The next time when we run Gradle with the --daemon option, only the
build is executed, because the JVM, with the required Gradle classes and libraries,
is already running.
gradle --daemon helloWorld
gradle --no-daemon helloWorld
gradle --stop

alias gradled='gradle --daemon'
export GRADLE_OPTS="-Dorg.gradle.daemon=true"

Profiling: --profile
The data is saved in an HTML file in the directory build/reports/profile.

--gui: gradle --gui
project.description = 'Simple project'
setDescription("Simple project")
project.description or project.getDescription()

We can use the doFirst and doLast methods to add actions to our task, and we can
use the left shift operator (<<) as a synonym for the doLast method. With the doLast
method or the left shift operator (<<) we add actions at the end of the list of actions
for the task. With the doFirst method we can add actions to the beginning of the
list of actions.
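
The ordering rules for doFirst, doLast and << can be modelled with a plain list. This is a toy Python class written for these notes, not Gradle's API; the names Task, do_first and do_last are my own.

```python
class Task:
    """Toy model of a task with an ordered list of actions."""

    def __init__(self, name):
        self.name = name
        self.actions = []

    def do_first(self, action):
        self.actions.insert(0, action)  # goes to the beginning of the list
        return self

    def do_last(self, action):
        self.actions.append(action)     # goes to the end of the list
        return self

    __lshift__ = do_last                # task << action behaves like do_last

    def run(self):
        return [action(self) for action in self.actions]


task = Task("helloWorld")
task << (lambda t: "Hello world.")
task.do_first(lambda t: "Running " + t.name)
task.do_last(lambda t: "Done.")
print(task.run())
```

The action added with do_first runs before the one added earlier with <<, and the do_last action runs at the end.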

println "Running ${task.name}"

Learning Android Studio

An activity is an instance of Activity, a class in the Android SDK. An activity is responsible for managing user
interaction with a screen of information.

Alt + 1 to open the project view
drawable/: A folder that contains the images used in our application. There are different drawable folders categorized into the different screen densities.

layout/: A folder that contains the XML definitions of the views and their elements.
menu/: A folder that contains the XML definitions of the menus of the application.
values/: A folder that contains the XML files that define sets of name-value pairs. There are different values folders categorized into different screens options to adapt the interface to them.
AndroidManifest.xml: This file declares basic information needed by the Android system to run the application, package name, version, activities, permissions, intents, or required hardware.
build.gradle: This file is the script used to build our application.

Editor settings
Change font size (Zoom) with Ctrl + Mouse Wheel.
Show quick doc on mouse move.

Appearance
Show line numbers
Show method separators

Editor Tabs:
Select the Mark modified tabs with asterisk option to easily detect the modified and not-saved files.

Auto Import:
Add unambiguous imports on the fly

Code completion: Ctrl+Space
To open the smart suggestions list, press Ctrl + Shift + the Spacebar.
Completion of statements. Type a statement, press Ctrl + Shift + Enter.

Code generation: 
Code | Generate: Alt + Insert
Code | Surround With or press Ctrl + Alt + T
Code | Insert Live Templates to open a dialog box of the available templates. 

Navigating code
Press Ctrl and click on the symbol or Navigate | Declaration

A custom region is just a piece of code that you want to group and give a name to. For example, if there is a class with a lot of methods, we can create some custom regions to distribute the methods among them. A region has a name or description and it can be collapsed or expanded using code folding.

Useful actions
Ctrl + W: Selects the expressions based on grammar. Keep pressing these keys again and again to expand the selection. The opposite command is Ctrl + Shift + W.
Ctrl + /: Comments each line of the selected code. To use block comments press Ctrl + Shift + /.
Ctrl + Alt + O: Optimizes the imports, removing the unused ones and reordering the rest of them.
Alt + Arrows: Switches between the opened tabs of the editor.
Ctrl + F: Finds a string in the active tab of the editor.
Ctrl + R: Replaces a string in the active tab of the editor.
Ctrl + D: Copies the selected code and pastes it at the end of it. If no code is selected, then the entire line is copied and pasted in a new line.
Ctrl + Y: Removes the entire line without leaving any blank line.
Ctrl + Shift + U: Toggles case.
Tab: Moves to the next parameter.

Creating User Interfaces
To switch between the graphical and the text editor, click on the bottom tabs, Design and Text.

Layouts: A layout is a container object to distribute the components on the screen. The root element of a user interface is a layout object, but layouts can also contain more layouts, creating a hierarchy of components structured in layouts. The recommendation is to keep this layout hierarchy as simple as possible. Our main layout has a relative layout as a root element.
Containers: These group components that share a common behavior. Radio groups, list views, scroll views, or tab hosts are some of them.

Creating a new layout
Right-click the layouts folder (res/layout/) and navigate to New | Layout resource file. You can also navigate to the menu option File | New | Layout resource file.
In the toolbar of the layout editor, find the activity option, click on it, and select the Associate with other Activity option.

layout:width: Its current value is wrap_content. This option will adapt the width of the field to its content. Change it to match_parent to adapt it to the parent layout width (the root relative layout).

Center the button horizontally in the parent layout using the layout_centerHorizontal property.

Supporting multiple screens
The device definitions indicate the screen size in inches, the resolution, and the screen density. Android divides screen densities into ldpi, mdpi, hdpi, xhdpi, and even xxhdpi.

ldpi (low-density dots per inch): About 120 dpi
mdpi (medium-density dots per inch): About 160 dpi
hdpi (high-density dots per inch): About 240 dpi
xhdpi (extra-high-density dots per inch): About 320 dpi
xxhdpi (extra-extra-high-density dots per inch): About 480 dpi

Device orientation: landscape or portrait
Create Landscape Variation
A new layout will be opened in the editor. This layout has been created in the resources folder, under the directory layout-land and using the same name as the portrait layout: /src/main/res/layout-land/activity_main.xml. 

Similarly, we can create a variation of the layout for xlarge screens. Select the option Create layout-xlarge Variation. The new layout will be created in the layout-xlarge folder: /src/main/res/layout-xlarge/activity_main.xml. Android divides the actual screen sizes into small, normal, large, and xlarge:

small: Screens classified in this category are at least 426 dp x 320 dp
normal: Screens classified in this category are at least 470 dp x 320 dp
large: Screens classified in this category are at least 640 dp x 480 dp
xlarge: Screens classified in this category are at least 960 dp x 720 dp
A dp is a density independent pixel, equivalent to one physical pixel on a 160 dpi screen.
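The dp definition and the size minimums above can be sketched in plain Java. This is only an illustration of the arithmetic, not an Android framework API: the class ScreenMetrics, its method names, and the "undersized" label are hypothetical, and the real platform classifies screens with more nuance than a simple threshold check.

```java
// Sketch only: plain-Java helpers, not Android framework calls.
final class ScreenMetrics {
    // Baseline density: 1 dp equals 1 physical pixel on a 160 dpi (mdpi) screen.
    static final int BASELINE_DPI = 160;

    // Converts a dp value to physical pixels for a screen of the given density.
    static int dpToPx(float dp, int densityDpi) {
        return Math.round(dp * densityDpi / BASELINE_DPI);
    }

    // Returns the size category using the minimum dp dimensions listed above.
    // longDp is the longer side of the screen, shortDp the shorter side.
    static String sizeBucket(int longDp, int shortDp) {
        if (longDp >= 960 && shortDp >= 720) return "xlarge";
        if (longDp >= 640 && shortDp >= 480) return "large";
        if (longDp >= 470 && shortDp >= 320) return "normal";
        if (longDp >= 426 && shortDp >= 320) return "small";
        return "undersized"; // hypothetical label for anything below "small"
    }
}
```

For example, ScreenMetrics.dpToPx(48, 320) gives 96, because an xhdpi screen has twice the baseline density, so a 48 dp button covers 96 physical pixels there and 48 on an mdpi screen.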

In the toolbar, click on the configuration option and select Preview All Screen Sizes, or select Preview Representative Sample to open just the most important screen sizes.
The Save screenshot option allows us to take a screenshot of the layout preview.

If we create some layout variations, we can preview all of them by selecting the option Preview Layout Versions.

Changing the UI theme
Styles and themes are created as resources under the /src/res/values.
Open the main layout using the graphical editor. The selected theme for our layout is indicated in the toolbar: AppTheme. This theme was created for our project and can be found in the styles file (/src/res/values/styles.xml). Open the styles file and notice that this theme is an extension of another theme (Theme.Light).

The themes created in our own project are listed in the Project Themes section. The section Manifest Themes shows the theme configured in the application manifest file (/src/main/AndroidManifest.xml). The All section lists all the available themes. 

All the UI widgets are children of the View class, and they share some events handled by the following listeners:
OnCreateContextMenuListener: Captures the event when the user performs a long click on the view element and we want to open a context menu
OnTouchListener: Captures the event when the user touches the view element
public void onAcceptClick(View v) {
  // Look up the greeting label and the name field by their resource IDs.
  TextView tv_greeting =
    (TextView) findViewById(R.id.textView_greeting);
  EditText et_name = (EditText) findViewById(R.id.editText_name);

  // Update the greeting only if the user typed a name.
  if (et_name.getText().length() > 0) {
    tv_greeting.setText("Hello " + et_name.getText());
  }
}
The R class is autogenerated in the build phase and we must not edit it. If this class is not autogenerated, then probably some file of our resources contains an error.

If the event we want to handle is not a click, we have to create the listener and add it in code, in the onCreate method of the activity.

Adding Google Play Services to Android Studio
Tools | Android | SDK Manager. We can find Google Play Services in the packages list under the folder Extras. Select the Google Play Services checkbox and click on the Install 1 package... button.
/sdk/extras/google/google_play_services/

Add this JAR file to your project by just dragging it into the libs/ folder. Once this is done, select the JAR file and press the right mouse button on it. Select the Add as Library option. In the create library dialog, select the project library level, select your application module, and click on OK.

Finally, you will need to add the library to your Gradle build file. To do this, edit the file MyApplication/build.gradle and add the following line in the dependencies section:

compile files('libs/google-play-services.jar')

Click on File | Import Project.

Tools | Android | SDK Manager
The Software Development Kit (SDK) Manager is an Android tool integrated in Android Studio to control our Android SDK installation. From this tool we can examine the Android platforms installed in our system, update them, install new platforms, or install some other components such as Google Play Services or the Android Support Library.

Android Virtual Device Manager
The Android Virtual Device Manager (AVD Manager) is an Android tool integrated in Android Studio to manage the Android virtual devices that will be executed in the Android emulator.

Emulation Options: The Snapshot option saves the state of the emulator so that it loads faster the next time. Check it. The Use Host GPU option tries to use the host machine's GPU hardware to run the emulator faster.

Tools | Generate Javadoc
VCS | Enable Version Control Integration

Device Definitions | New Device
Waiting for device: Start point when the emulator is being launched.
Uploading file: The application is packed and stored in the device.
Installing: The application is being installed in the device. After the installation a success message should be printed.
Launching application: The application starts to execute.
Waiting for process: The application should now be running and the debug system tries to connect to the application process in the device.
To execute the next line of code without stepping into the method call, navigate to Run | Step Over or use the keyboard shortcut indicated for this option, usually F8.
To step into the method call, navigate to Run | Step Into or press F7.
To resume the execution until the next breakpoint, if there is one, navigate to Run | Resume Program or press F9.
To stop the execution, navigate to Run | Stop or press Ctrl + F2.

LogCat is the Android logging system that displays all the log messages generated by the Android system in the running device.
The Log class provides the v method for verbose, the d method for debug, the i method for information, the w method for warning, and the e method for error messages.
Log.w("MainActivity", "No name typed, greeting didn't change");
Edit Filter Configuration
Log messages can be filtered by their tag or their content using regular expressions, by the name of the package that printed them, by the process ID (PID), or by their level.
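Filtering by level and by a tag regular expression can be sketched in plain Java. This is not how LogCat is implemented: the class LogLevelFilter and its Entry type are hypothetical stand-ins. The numeric priorities mirror the constants in android.util.Log, ordered from verbose (lowest) to error (highest).

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: a stand-in for LogCat filtering, not part of the Android SDK.
final class LogLevelFilter {
    // Same values as android.util.Log.VERBOSE ... android.util.Log.ERROR.
    static final int VERBOSE = 2, DEBUG = 3, INFO = 4, WARN = 5, ERROR = 6;

    // Hypothetical log record: level, tag, and message.
    static final class Entry {
        final int level;
        final String tag;
        final String message;

        Entry(int level, String tag, String message) {
            this.level = level;
            this.tag = tag;
            this.message = message;
        }
    }

    // Keeps entries at or above minLevel whose whole tag matches tagRegex.
    static List<Entry> filter(List<Entry> entries, int minLevel, String tagRegex) {
        List<Entry> kept = new ArrayList<>();
        for (Entry e : entries) {
            if (e.level >= minLevel && e.tag.matches(tagRegex)) {
                kept.add(e);
            }
        }
        return kept;
    }
}
```

With a minimum level of WARN and the tag pattern "Main.*", a warning logged under the tag MainActivity is kept, while debug messages with the same tag and errors with other tags are dropped.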

DDMS
The Dalvik Debug Monitor Server (DDMS) is a more advanced debugging tool from the SDK that has also been integrated into Android Studio. This tool is able to monitor both a real device and the emulator.
Tools | Android | Monitor (DDMS included)

Click on the Update Threads icon button from the toolbar of the devices section and the threads will be loaded in the content of the tab.

Preparing for Release
APK (Application Package) file
Android applications are packed in a file with the .APK extension, which is a variation of a Java JAR (Java Archive) file. These files are just compressed ZIP files, so their content can be easily explored. An APK file usually contains:

assets/: A folder that contains the assets files of the application. This is the same assets folder existing in the project.
META-INF/: A folder that contains our certificates.
lib/: A folder that contains compiled native code for specific processors, if the application needs it.
res/: A folder that contains the application resources.
AndroidManifest.xml: The application manifest file.
classes.dex: A file that contains the application compiled code.
resources.arsc: A file that contains some precompiled resources.
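Because an APK is a ZIP archive, its content can be listed with the standard java.util.zip classes. The sketch below is a minimal illustration: the class ApkLister and both method names are hypothetical, and fakeApk only builds a tiny in-memory archive as a stand-in for a real APK so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Sketch only: lists the entries of ZIP-format bytes such as an APK file.
final class ApkLister {
    // Returns the entry names in the order they appear in the archive.
    static List<String> listEntries(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    // Builds a small in-memory archive with one-byte entries (stand-in for an APK).
    static byte[] fakeApk(String... entryNames) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            for (String name : entryNames) {
                out.putNextEntry(new ZipEntry(name));
                out.write(0); // placeholder content
                out.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }
}
```

To inspect a real APK, the bytes could come from java.nio.file.Files.readAllBytes on the path of the generated file instead of fakeApk.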

To install an APK that does not come from Google Play, check the Unknown sources option in Settings | Security on the device.
Applications have to be signed with a private key when they are built. An application can't be installed in a device or even in the emulator if it is not signed. To build our application there are two modes, debug and release. Both APK versions contain the same folders and compiled files. The difference is the key used to sign them:

Debug: The Android SDK tools automatically create a debug key, an alias, and their passwords to sign the APK. This process occurs when we are running or debugging our application with Android Studio without us realizing that. We can't publish an APK signed with the debug key created by the SDK tools.
Release: The APK file must be signed with a certificate for which the developer keeps the private key. In this case, we need our own private key, alias, and password, and we must provide them to the build tools. The certificate identifies the developer of the application and can be a self-signed certificate. It is not necessary for a certificate authority to sign the certificate.
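The idea behind signing (the developer signs with a private key, and anyone can check the result with the matching public key) can be sketched with the standard java.security classes. This is not the APK signing procedure performed by the build tools: the class SigningSketch, the RSA key size, and the SHA256withRSA algorithm are assumptions chosen only for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Sketch only: shows sign/verify with a key pair, not the APK signing scheme.
final class SigningSketch {
    // Generates a fresh RSA key pair (assumed size: 2048 bits).
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Signs the data with the private key, which the developer keeps secret.
    static byte[] sign(byte[] data, PrivateKey privateKey) {
        try {
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(privateKey);
            signer.update(data);
            return signer.sign();
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Checks the signature with the public key; false if the data was altered.
    static boolean verify(byte[] data, byte[] signature, PublicKey publicKey) {
        try {
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(publicKey);
            verifier.update(data);
            return verifier.verify(signature);
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

A signature produced by sign verifies against the original bytes, and verification fails if the bytes change afterwards, which is why a signed package cannot be modified without being signed again.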

Previous steps
Before generating the release APK, test the application:
On a device using the minimum required platform
On a device using the target platform
On a device using the latest available platform
On a real device and not just in the emulator
On a variety of screen resolutions and sizes
On a tablet if your application supports them
Switching to the landscape mode if you allow it, both in a mobile device and in a tablet
On different network conditions, such as no Internet connectivity or low coverage
If your application uses the GPS or any location service, test it when they are not activated in the device
Behavior of the back button

Printing some log messages can be considered a security vulnerability. Logs generated by the Android system can be captured and analyzed, so we should avoid showing critical information about the application's internal workings. You should also remove the android:debuggable property from the application manifest file, or set it to false.

Finally, set the correct values for the android:versionCode and android:versionName properties in the application manifest file. The version code is an integer that represents the application version. New versions should have greater version codes. This code is used to determine whether the application installed in a device is the latest version or there is a newer one.

The version name is a string that represents the application version. Unlike the version code, the version name is visible to the user and appears in the public information about the application. It is just an informative version name to the user and is not used for any internal purpose.
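The difference between the two properties can be sketched in plain Java. The class AppVersion and its isNewerThan method are hypothetical names used only for illustration. The point is that only the integer version code takes part in the comparison, while the version name is purely informative.

```java
// Sketch only: models the two manifest properties, not an Android API.
final class AppVersion {
    final int versionCode;    // android:versionCode, used to compare versions
    final String versionName; // android:versionName, shown to the user only

    AppVersion(int versionCode, String versionName) {
        this.versionCode = versionCode;
        this.versionName = versionName;
    }

    // A version is newer only if its code is greater; the name is ignored.
    boolean isNewerThan(AppVersion installed) {
        return this.versionCode > installed.versionCode;
    }
}
```

For example, a build with code 3 and name "1.0.1" counts as newer than an installed build with code 2 and name "2.0-beta", even though the names suggest the opposite order.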

Generating a signed APK
Navigate to Build | Generate Signed APK.
Click the Create new button to open the dialog box to create a new key store.
Alias: Alias for your certificate and its pair of public and private keys. For example, name it releasekey.

Choose existing

For Gradle builds, see the signing configurations section of the build system user guide:
http://tools.android.com/tech-docs/new-build-system/user-guide#TOC-Signing-Configurations