Is “county subdivision” geography supported for get_acs() - tidycensus

I'm trying to pull data at the "county subdivision" level, one of the geography options in tidycensus' get_acs() function. I know there are several geography options, which Kyle Walker has documented on his page: https://walkerke.github.io/tidycensus/articles/basic-usage.html#geography-in-tidycensus
This works fine at the state and county level, because you just put county = "Monmouth". But I can't get the syntax to work at the county subdivision level for a city within Monmouth County. I've looked for other tidycensus scripts, but haven't found any that use geographies below the county level.
Any suggestions?
library(tidycensus)
library(tidyverse)
library(sf)
census_api_key("YOUR API KEY GOES HERE")
vars <- c(English = "C16002_002",
Spanish = "C16002_003")
language <- get_acs(geography = "county subdivision",
state = "NJ",
county = "Monmouth",
city = "Red Bank",
table = "C16001")
rb_language <- get_acs(geography = "tract",
variables = vars,
state = "NJ",
county = "Monmouth",
city = "Red Bank",
geometry = TRUE,
summary_var = "C16002_001") %>%
st_transform(26918)

It's not entirely clear to me whether you want data for the Red Bank county subdivision or for the census tracts within Red Bank. Either way, tidycensus can't do this directly; instead, get all subdivisions or tracts in the county with get_acs() and then filter the results.
For example, if you just want language data for Red Bank county subdivision, you could do this:
library(tidycensus)
library(tidyverse)
library(sf)
library(tigris)
vars <- c(English = "C16002_002",
Spanish = "C16002_003")
# get all subdivisions in monmouth county
language_subdiv <- get_acs(geography = "county subdivision",
state = "NJ",
county = "Monmouth",
table = "C16001")
# only red bank borough
language_subdiv %>%
filter(str_detect(NAME, "Red Bank"))
#> # A tibble: 38 x 5
#> GEOID NAME variable estimate moe
#> <chr> <chr> <chr> <dbl> <dbl>
#> 1 34025624… Red Bank borough, Monmouth County, N… C16001_0… 11405 171
#> 2 34025624… Red Bank borough, Monmouth County, N… C16001_0… 7227 451
#> 3 34025624… Red Bank borough, Monmouth County, N… C16001_0… 3789 425
#> 4 34025624… Red Bank borough, Monmouth County, N… C16001_0… 1287 247
#> 5 34025624… Red Bank borough, Monmouth County, N… C16001_0… 2502 435
#> 6 34025624… Red Bank borough, Monmouth County, N… C16001_0… 0 19
#> 7 34025624… Red Bank borough, Monmouth County, N… C16001_0… 0 19
#> 8 34025624… Red Bank borough, Monmouth County, N… C16001_0… 0 19
#> 9 34025624… Red Bank borough, Monmouth County, N… C16001_0… 42 40
#> 10 34025624… Red Bank borough, Monmouth County, N… C16001_0… 0 19
#> # ... with 28 more rows
Now, if you wanted census tracts within Red Bank, you could grab all census tracts in Monmouth, then use tigris::places() to get the boundaries of Red Bank, and finally filter the census tracts to only get those tracts that are contained by the Red Bank boundary.
# get all tracts in monmouth county
language_tract <- get_acs(geography = "tract",
variables = vars,
state = "NJ",
county = "Monmouth",
geometry = TRUE,
summary_var = "C16002_001",
output = "wide")
# get geometry of red bank borough
red_bank_place <- places("NJ", cb = TRUE, class = "sf") %>%
filter(NAME == "Red Bank")
# only tracts in red bank borough
red_bank_tracts <- language_tract %>%
# st_contains() returns a 1 x n matrix here; take its only row as a logical vector
filter(st_contains(red_bank_place, ., sparse = FALSE)[1, ])
ggplot() +
geom_sf(data = red_bank_tracts, color = "blue", fill = NA) +
geom_sf(data = red_bank_place, color = "black", fill = NA)
Created on 2018-12-24 by the reprex package (v0.2.1)

Related

Javascript multiple occurrences of the keywords

I have a text which contains some keywords followed by sentences like,
var data = "Name The United States of America (USA), commonly referred to as the United States (U.S.) or America, is a federal republic composed of 50 states, the federal district of Washington, D.C., five major territories, and various possessions. **About** 48 contiguous states and Washington, D.C., are in central North America between Canada and Mexico. The state of Alaska is in the northwestern part of North America and the state of Hawaii is an archipelago in the mid-Pacific. The territories are scattered **about** the Pacific Ocean and the Caribbean Sea. At 3.8 million square miles and with over 320 million people, the country is the world's third largest by total area and the third most populous. It is one of the world's most ethnically diverse and multicultural nations, the product of large-scale immigration from many countries. Life The geography and climate of the United States are also extremely diverse, and the country is home to **about** a wide variety of wildlife. Rest USA is a diversified nation and Niagara is world famous.";
In the above text there are 4 keywords: Name, About, Life, Rest. I want to separate the text that follows each of these keywords into separate string arrays. The order in which these keywords appear in the text is always the same. I have tried the following code so far:
var name = [];
var about = [];
var life = [];
function transform_report(data) {
var keywords = ["Name", "About", "Life", "Rest"];
var output_data = "Event ";
var keyword_index = 0;
var input_data = data.toString();
var pos = -1;
for (var i = 0; i < keywords.length; i++) {
pos = input_data.indexOf(keywords[i]);
if (pos != -1) {
keyword_index = i;
break;
}
}
while (pos != -1) {
output_data += keywords[keyword_index] + " : ";
pos += keywords[keyword_index].length;
var index = keyword_index;
keyword_index = find_next_keyword(keywords, keyword_index, input_data, pos);
var end_pos = input_data.indexOf(keywords[keyword_index]);
var output_text = input_data.slice(pos, end_pos).replace(/:/, '');
output_data += output_text.trim() + "\n";
if (keywords[index] === "Name") {
name.push(output_text.trim());
}
if ((keywords[index] === "About")) {
about.push(output_text.trim());
}
if ((keywords[index] === "Life")) {
life.push(output_text.trim());
}
pos = end_pos;
}
return output_data;
}
function find_next_keyword(keywords, index, input_data, pos) {
var orig_index = index;
var min_pos = input_data.length;
var min_index = index;
if (index == keywords.length - 1)
return -1;
for (var i = 0; i < keywords.length; i++) {
if (i == orig_index)
continue;
var keyword = keywords[i];
var next_keyword_pos = input_data.indexOf(keyword, pos);
if (next_keyword_pos != -1 && next_keyword_pos < min_pos) {
min_pos = next_keyword_pos;
min_index = i;
}
}
return min_index;
}
The above code works fine when the keywords appear only once in the data. But in this case, the keyword "About" appears also as a word in the sentences that should be put in "about array" and "life array". The output should be:
name array contains:
The United States of America (USA), commonly referred to as the United States (U.S.) or America, is a federal republic composed of 50 states, the federal district of Washington, D.C., five major territories, and various possessions.
about array contains: 48 contiguous states and Washington, D.C., are in central North America between Canada and Mexico. The state of Alaska is in the northwestern part of North America and the state of Hawaii is an archipelago in the mid-Pacific. The territories are scattered about the Pacific Ocean and the Caribbean Sea. At 3.8 million square miles and with over 320 million people, the country is the world's third largest by total area and the third most populous. It is one of the world's most ethnically diverse and multicultural nations, the product of large-scale immigration from many countries.
life array contains: The geography and climate of the United States are also extremely diverse, and the country is home to about a wide variety of wildlife.
But since the keyword appears as a normal word, I am not getting the required output. Are there any ways to do this in Javascript? Thanks a lot in advance.
Considering your condition:
"... The order in which these keywords appear in the text is always the same."
the main goal can be achieved with the following approach, using the String.split, String.replace, String.substring and Array.indexOf functions:
// data is the initial string(text)
var splitted = data.split(/\.\s/), // splitting sentences
keywords = ["Name", "About", "Life", "Rest"],
currentKeyword = "", // the last active keyword
keysObject = {'name' : [], 'about' : [], 'life' : [], 'rest' : []};
splitted.forEach(function(v){
var first = v.substring(0, v.indexOf(" ")).replace(/\W/g, "");
if (keywords.indexOf(first) !== -1) {
keysObject[first.toLowerCase()].push(v.substring(v.indexOf(" ") + 1));
currentKeyword = first.toLowerCase();
} else {
keysObject[currentKeyword].push(v);
}
});
console.log(JSON.stringify(keysObject, null, 4));
The output:
{
"name": [
"The United States of America (USA), commonly referred to as the United States (U.S.) or America, is a federal republic composed of 50 states, the federal district of Washington, D.C., five major territories, and various possessions"
],
"about": [
"48 contiguous states and Washington, D.C., are in central North America between Canada and Mexico",
"The state of Alaska is in the northwestern part of North America and the state of Hawaii is an archipelago in the mid-Pacific",
"The territories are scattered **about** the Pacific Ocean and the Caribbean Sea",
"At 3.8 million square miles and with over 320 million people, the country is the world's third largest by total area and the third most populous",
"It is one of the world's most ethnically diverse and multicultural nations, the product of large-scale immigration from many countries"
],
"life": [
"The geography and climate of the United States are also extremely diverse, and the country is home to **about** a wide variety of wildlife"
],
"rest": [
"USA is a diversified nation and Niagara is world famous."
]
}
If I'm understanding your problem correctly, you want to start a new string at the first "About" but not at the ones that come after it. I was able to do this using String.search(), because it returns the position of the first match.
http://codepen.io/jnfr/pen/gMYbPJ
<button onclick="myFunction()">button</button>
<p id="name"></p>
<p id="about"></p>
<p id="life"></p>
<p id="rest"></p>
function myFunction() {
var str = "Name The United States of America (USA), commonly referred to as the United States (U.S.) or America, is a federal republic composed of 50 states, the federal district of Washington, D.C., five major territories, and various possessions. About 48 contiguous states and Washington, D.C., are in central North America between Canada and Mexico. The state of Alaska is in the northwestern part of North America and the state of Hawaii is an archipelago in the mid-Pacific. The territories are scattered about the Pacific Ocean and the Caribbean Sea. At 3.8 million square miles and with over 320 million people, the country is the world's third largest by total area and the third most populous. It is one of the world's most ethnically diverse and multicultural nations, the product of large-scale immigration from many countries. Life The geography and climate of the United States are also extremely diverse, and the country is home to about a wide variety of wildlife. Rest USA is a diversified nation and Niagara is world famous.";
var n = str.search("About");
var name = str.slice(0, n);
var p = str.search("Life");
var about = str.slice(n, p);
var r = str.search("Rest");
var life = str.slice(p, r);
var rest = str.slice(r, str.length);
document.getElementById("name").innerHTML = name;
document.getElementById("about").innerHTML = about;
document.getElementById("life").innerHTML = life;
document.getElementById("rest").innerHTML = rest;
}
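Since the keywords always appear in the same order, the same idea can be generalized by searching for each keyword only after the position of the previous one. The following is a minimal sketch, not a tested drop-in: splitByOrderedKeywords and the short sample string are hypothetical, and it assumes that the first case-sensitive occurrence of a keyword after the previous keyword is the real marker.

```javascript
// Hypothetical helper: split `text` into sections, given keywords that are
// known to appear in a fixed order.
function splitByOrderedKeywords(text, keywords) {
  var found = [];
  var from = 0;
  keywords.forEach(function (keyword) {
    // Search only after the previous keyword; indexOf is case-sensitive,
    // so a lowercase "about" in running text is not treated as a marker.
    var pos = text.indexOf(keyword, from);
    if (pos === -1) return; // keyword missing: skip it
    found.push({ keyword: keyword, pos: pos });
    from = pos + keyword.length;
  });

  var sections = {};
  found.forEach(function (entry, i) {
    var start = entry.pos + entry.keyword.length;
    var end = i + 1 < found.length ? found[i + 1].pos : text.length;
    sections[entry.keyword.toLowerCase()] = text.slice(start, end).trim();
  });
  return sections;
}

// Short stand-in for the long text in the question.
var sample = "Name Alpha is first. About Beta walks about town. Life Gamma. Rest Delta.";
var sections = splitByOrderedKeywords(sample, ["Name", "About", "Life", "Rest"]);
console.log(sections.about); // "Beta walks about town."
```

This still breaks if a later keyword shows up capitalized inside an earlier section (for example "Life" inside the About text), or as the start of a longer word, so it is only as reliable as the assumption above.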

Enumerating arbitrary object keys - Javascript

Good day,
I ran into a problem while parsing this result from Wikipedia.
{
"batchcomplete": "",
"query": {
"pages": {
"252408": {
"pageid": 252408,
"ns": 0,
"title": "Bulacan",
"extract": "Bulacan (Tagalog: Lalawigan ng Bulacan; Kapampangan: Lalawigan ning Bulacan) (PSGC: 031400000; ISO: PH-BUL) is a province in the Philippines, located in the Central Luzon Region (Region III) in the island of Luzon, 11 kilometres (6.8 mi) north of Manila (the nation's capital), and part of the Metro Luzon Urban Beltway Super Region. Bulacan was established on August 15, 1578.\nIt has 569 barangays from 21 municipalities and three component cities (Malolos the provincial capital, Meycauayan, and San Jose del Monte). Bulacan is located immediately north of Metro Manila. Bordering Bulacan are the provinces of Pampanga to the west, Nueva Ecija to the north, Aurora and Quezon to the east, and Metro Manila and Rizal to the south. Bulacan also lies on the north-eastern shore of Manila Bay.\nIn the 2015 census, Bulacan had a population of 3,292,071 people, the highest in Region III and the 2nd most populous in the Philippines. Bulacan's most populated city is San Jose del Monte, the most populated municipality is Santa Maria while the least populated is Doña Remedios Trinidad.\nIn 1899, the historic Barasoain Church in Malolos was the birthplace of the First Constitutional Democracy in Asia.\n\n"
}
}
}
}
But the data is nested under a seemingly random numeric key, "252408". I want to read the value of the "extract" key without hard-coding that number.
You can use Object.values() to access the value stored under the unknown key.
var res = {"batchcomplete": "","query":{"pages":{"252408": {"pageid": 252408,"ns": 0,"title": "Bulacan","extract": "Bulacan (Tagalog: Lalawigan ng Bulacan; Kapampangan: Lalawigan ning Bulacan) (PSGC: 031400000; ISO: PH-BUL) is a province in the Philippines, located in the Central Luzon Region (Region III) in the island of Luzon, 11 kilometres (6.8 mi) north of Manila (the nation's capital), and part of the Metro Luzon Urban Beltway Super Region. Bulacan was established on August 15, 1578.\nIt has 569 barangays from 21 municipalities and three component cities (Malolos the provincial capital, Meycauayan, and San Jose del Monte). Bulacan is located immediately north of Metro Manila. Bordering Bulacan are the provinces of Pampanga to the west, Nueva Ecija to the north, Aurora and Quezon to the east, and Metro Manila and Rizal to the south. Bulacan also lies on the north-eastern shore of Manila Bay.\nIn the 2015 census, Bulacan had a population of 3,292,071 people, the highest in Region III and the 2nd most populous in the Philippines. Bulacan's most populated city is San Jose del Monte, the most populated municipality is Santa Maria while the least populated is Doña Remedios Trinidad.\nIn 1899, the historic Barasoain Church in Malolos was the birthplace of the First Constitutional Democracy in Asia.\n\n"}}}};
let result = Object.values(res.query.pages)[0].extract;
console.log(result);
You can do something like var keys = Object.keys(obj.query.pages); (where obj is the parsed response) to get the unknown key (keys[0]), and then use that key to fetch the extract property: obj.query.pages[keys[0]].extract.
You can also use for-in loop to traverse object
var wiki = {
"batchcomplete": "",
"query": {
"pages": {
"252408": {
"pageid": 252408,
"ns": 0,
"title": "Bulacan",
"extract": "Bulacan (Tagalog: Lalawigan ng Bulacan; Kapampangan: Lalawigan ning Bulacan) (PSGC: 031400000; ISO: PH-BUL) is a province in the Philippines, located in the Central Luzon Region (Region III) in the island of Luzon, 11 kilometres (6.8 mi) north of Manila (the nation's capital), and part of the Metro Luzon Urban Beltway Super Region. Bulacan was established on August 15, 1578.\nIt has 569 barangays from 21 municipalities and three component cities (Malolos the provincial capital, Meycauayan, and San Jose del Monte). Bulacan is located immediately north of Metro Manila. Bordering Bulacan are the provinces of Pampanga to the west, Nueva Ecija to the north, Aurora and Quezon to the east, and Metro Manila and Rizal to the south. Bulacan also lies on the north-eastern shore of Manila Bay.\nIn the 2015 census, Bulacan had a population of 3,292,071 people, the highest in Region III and the 2nd most populous in the Philippines. Bulacan's most populated city is San Jose del Monte, the most populated municipality is Santa Maria while the least populated is Doña Remedios Trinidad.\nIn 1899, the historic Barasoain Church in Malolos was the birthplace of the First Constitutional Democracy in Asia.\n\n"
}
}
}
}
for(var key in wiki["query"].pages){
console.log(key);
}
So, if the object under the "pages" key will always only have one key, and you don't know the value of it, you can use,
var randKeyObj = Object.keys(obj.query.pages)[0];
This always gets the first (or only) key of the object that's passed in.
It's the page ID, not a random number. You can use the formatversion=2 API parameter to get a plain array.
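If the code has to cope with both response shapes (the default object keyed by page ID, and the array returned with formatversion=2), a small helper can normalize them. This is only a sketch: firstExtract is a hypothetical name, and the two sample responses are abbreviated stand-ins for the real API output.

```javascript
// Hypothetical helper: return the extract of the first page in a parsed
// query response, whichever shape `query.pages` has.
function firstExtract(response) {
  var pages = response.query.pages;
  // formatversion=2 gives an array; the default format gives an object
  // keyed by page ID.
  var list = Array.isArray(pages) ? pages : Object.values(pages);
  return list.length > 0 ? list[0].extract : undefined;
}

// Abbreviated sample data (not the full extract from the question).
var defaultShape = {
  query: { pages: { "252408": { pageid: 252408, title: "Bulacan", extract: "Bulacan is a province." } } }
};
var arrayShape = {
  query: { pages: [{ pageid: 252408, title: "Bulacan", extract: "Bulacan is a province." }] }
};

console.log(firstExtract(defaultShape));
console.log(firstExtract(arrayShape));
```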

Iterating arrays

Is there a way to iterate over multiple arrays and return different values from each one?
Ex:
{
"gameQuestion": [
"English League Championship: What will be the match result?",
"2017 Boston Marathon: Which COUNTRY will the MEN'S WINNER represent?",
"MLB: Who will WIN this matchup?",
"English League Championship (Huddersfield Town # Derby County): Will Derby SCORE in the 2nd Half?",
"English Premier League: What will be the match result?",
"MLB: Who will WIN this matchup?",
"NBA Eastern Conference Playoffs - 1st Rd (Cavaliers lead 1-0): Who will WIN this matchup?",
"NBA (IND#CLE): Which PLAYER will SCORE a HIGHER PERCENTAGE of their TEAM'S TOTAL POINTS in the 1st Half?",
"NHL Eastern Conference Playoffs - 1st Rd (Series tied 1-1): Who will WIN this matchup?",
"NHL Eastern Conference Playoffs - 1st Rd (Series tied 1-1): Who will WIN this matchup?",
"MLB: Who will WIN this matchup?",
"MLB: Who will WIN this matchup?",
"MLB: Who will WIN this matchup?",
"MLB: Who will WIN this matchup?",
"NBA (IND#CLE): Will a 3-POINTER be MADE in the FIRST 2 MINUTES of the 3rd Quarter?",
"NBA Western Conference Playoffs - 1st Rd (Spurs lead 1-0): What will be the GAME RESULT?",
"NHL Western Conference Playoffs - 1st Rd (Predators lead 2-0): Who will WIN this matchup?",
"NHL Western Conference Playoffs - 1st Rd (Ducks lead 2-0): Who will WIN this matchup?",
"MLB: Who will WIN this matchup?",
"MLB: Who will WIN this matchup?"
],
"propVal": [
"m57338o58525",
"m57338o58526",
"m57336o58521",
"m57336o58522",
"m57329o4111",
"m57329o12",
"m57316o793",
"m57316o726",
"m57319o58515",
"m57319o58516",
"m57322o423",
"m57322o461",
"m57323o517",
"m57323o515",
"m57327o206",
"m57327o15",
"m57330o14",
"m57330o35",
"m57331o21",
"m57331o148",
"m57298o27453",
"m57298o112",
"m57320o58517",
"m57320o58518",
"m57318o58513",
"m57318o58514",
"m57325o481",
"m57325o479",
"m57326o463",
"m57326o5964",
"m57333o19384",
"m57333o78",
"m57334o3",
"m57334o5"
],
"info": [
"Opponents",
" Aston Villa: Win or Draw",
"# Fulham: Win",
"Kenya",
" Any Other Country",
" Tampa Bay Rays (6-7) Snell",
" # Boston Red Sox (7-5) Wright",
" Yes: Derby Scores 1+ Goals in 2nd Half",
" No: No Derby Goal in 2nd Half",
" Arsenal: Win",
" # Middlesbrough: Win or Draw",
" Chicago White Sox (6-5) Holland",
" # New York Yankees (8-4) Montgomery",
" Indiana Pacers (42-40)",
" # Cleveland Cavaliers (51-31)",
" Paul George (IND)",
" LeBron James (CLE) or Tie",
" Ottawa Senators (44-28-10)",
" # Boston Bruins (44-31-7)",
" Washington Capitals (55-19-8)",
" # Toronto Maple Leafs (40-27-15)",
" Pittsburgh Pirates (6-6) Nova",
" # St. Louis Cardinals (3-9) Lynn",
" Milwaukee Brewers (7-6) Anderson",
" # Chicago Cubs (6-6) Lackey",
" Cleveland Indians (5-7) Salazar",
" # Minnesota Twins (7-5) Gibson",
" Los Angeles Angels (6-7) Chavez",
" # Houston Astros (8-4) Morton",
" Yes: 3PM in First 2 Min of 3rd Qtr",
" No: No 3PM in First 2 Min of 3rd Qtr",
" Grizzlies: Win or Single Digit Loss",
" # Spurs: Win By Double Digits",
" Chicago Blackhawks (50-23-9)",
" # Nashville Predators (41-29-12)",
" Anaheim Ducks (46-23-13)",
" # Calgary Flames (45-33-4)",
" Miami Marlins (7-5) Koehler",
" # Seattle Mariners (5-8) Miranda",
" Arizona Diamondbacks (8-5) Ray",
" # Los Angeles Dodgers (7-6) McCarthy"
]
}
and I want to iterate over all three at the same time, but return
["English League Championship: What will be the match result?", " Aston Villa: Win or Draw", "# Fulham: Win", "m57338o58525", "m57338o58526"] then ["2017 Boston Marathon: Which COUNTRY will the MEN'S WINNER represent?", "Kenya", " Any Other Country", "m57336o58521", "m57336o58522"] and so on. On b (the info array) I need to skip the first element.
var json = require('./output.json');
var a = json.gameQuestion;
var b = json.info;
var c = json.propVal;
var res= [];
for(var i = 0; i < a.length; i++){
res.push([a[i],b[i*2+1],b[i*2+2],c[i*2],c[i*2+1]]);
}
console.log(res[0]);
console.log(res[1]);
console.log(res[2]);
console.log(res[3]);
console.log(res[4]);
console.log(res[5]);
I've got the first iteration, but every time I add another for loop it just ends up returning j the same number of times as the length of the first for loop.
Update: Thanks! This problem is solved!
If I didn't get you wrong, check this:
var a = [1,2,3]
var b = [69,4,5,6,7,8,9]
var c = [10,11,12,13,14,15]
var res= [];
for(var i = 0; i<a.length; i++){
res.push([a[i],b[i*2+1],b[i*2+2],c[i*2],c[i*2+1]]);
}
console.log(res[0]);
console.log(res[1]);
console.log(res[2]);
You can try this:
var a = [1,2,3]
var b = [69,4,5,6,7,8,9]
var c = [10,11,12,13,14,15]
// For each value in array a you need different positions in arrays b and c
// within a single iteration, so use a separate index for each array,
// each one derived from i.
for(var i=0;i<a.length;i++){
var final = [];
var j = 2*i+1; // j indexes array 'b' (the +1 skips its first element)
var k = 2*i; // k indexes array 'c'
final.push(a[i]);
final.push(b[j]);
final.push(b[j+1]);
final.push(c[k]);
final.push(c[k+1]);
console.log(final);
}
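The same index arithmetic can be wrapped in a function that returns the grouped rows instead of logging them. This is a sketch under the assumption that every question owns exactly two entries in each of the other arrays; groupRows is a hypothetical name, and the function stops early rather than emit undefined values if the other arrays run out.

```javascript
// Hypothetical helper: one row per question, with two entries from `info`
// (skipping its first element) and two entries from `propVal`.
function groupRows(questions, info, propVal) {
  var rows = [];
  for (var i = 0; i < questions.length; i++) {
    var j = 2 * i + 1; // index into info; the +1 skips info[0]
    var k = 2 * i;     // index into propVal
    // Stop if either array has no complete pair left for this question.
    if (j + 1 >= info.length || k + 1 >= propVal.length) break;
    rows.push([questions[i], info[j], info[j + 1], propVal[k], propVal[k + 1]]);
  }
  return rows;
}

var rows = groupRows([1, 2, 3], [69, 4, 5, 6, 7, 8, 9], [10, 11, 12, 13, 14, 15]);
console.log(rows);
// [[1, 4, 5, 10, 11], [2, 6, 7, 12, 13], [3, 8, 9, 14, 15]]
```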

leaflet marker not displaying in certain contexts

I am using the leaflet htmlwidget implementation to draw a web-based map using R. I was looking for a specific marker, couldn't find it, and realized it wasn't being displayed at all. However, when I subset down my dataset to just that entry, the marker displays beautifully.
Here is a screenshot of the marker, with code having been run after subsetting the data to just this marker (using the simple line of R script thecounted <- thecounted[thecounted$age==6,]):
Here is the same location when placing the whole dataset down as markers.
Does anyone know what is going on? Am I hitting some arbitrary limit on the number of markers that browsers/leaflet will lay down? This is not a glitch specific to this entry as plenty of other markers are not showing either...
Here is the entirety of my code.
#download needed packages you don't have
wants <- c("magrittr", "leaflet", "jsonlite", "curl")
has <- wants %in% rownames(installed.packages())
if(any(!has)) install.packages(wants[!has])
require(jsonlite)
require(curl)
require(leaflet)
require(magrittr)
#pull data from json file embedded in the Guardian's The Counted website: http://www.theguardian.com/thecounted
thecounted <- fromJSON("https://interactive.guim.co.uk/2015/the-counted/v/1455138961531/files/skeleton.json")
#Color-code for whether the victim was armed
# Red = Unarmed
unarmedC <-"#ff0000"
# Teal = armed
armedC <- "#008080"
# Black = Don't know or ambiguous category like "Non-lethal firearm" or "vehicle"
idkC <- "#000000"
pal <- colorFactor(c(idkC, rep(armedC,2), unarmedC, rep(idkC,4)), domain= c("Disputed",
"Firearm",
"Knife",
"No",
"Non-lethal firearm",
"Other",
"Unknown",
"Vehicle"))
# automatically set date range for pulled data
today <- Sys.Date()
today <- format(today, format="%b %d %Y")
dateRange <- paste0("(Jan 01 2015"," - ", today,")")
#Use the leaflet htmlwidget to create an interactive online visualization of data
leaflet(data = thecounted) %>% #data from the counted
#add default open source map tiles from OpenStreetMap
addTiles() %>%
#fit bounds around the USA
fitBounds(-125,25, -67,49) %>%
#add a map legend
addLegend(
title=paste(sep="<br/>","People killed by police",dateRange),
position = 'bottomright',
colors = c(unarmedC,armedC, idkC),
labels = c("Unarmed", "Armed", "Unknown / non-lethal / vehicle / other")) %>%
#dynamically add markers for people who were killed
addCircleMarkers(~long, ~lat, stroke=FALSE,
color = ~pal(armed), #color defined above
fillOpacity = ifelse(thecounted$armed=="No",0.75,0.35), #make unarmed dots more visible
#create pop-up windows with some information for each marker
popup = ~ paste(name, "<br/>",
"Age",age,"<br/>",
#include race if available
ifelse(race == "B", "Black",
ifelse(race == "W" , "White",
ifelse(race =="H", "Hispanic",
ifelse(race == "A", "Asian",
ifelse(race == "N", "Native American",
ifelse(race == "U", "Race unknown", "")))))),"<br/>",
#tell us whether they were unarmed or if unknown, else leave blank
#because the categories for being armed are convoluted
ifelse(armed=="No", "Unarmed<br/>",
ifelse(armed=="Unknown", "Unknown if armed<br/>",
ifelse(armed=="Vehicle", "Armed with 'vehicle'<br/>",
ifelse(armed=="Knife", "Had a knife<br/>",
ifelse(armed=="Disputed", "Disputed if armed<br/>", ""))))),
#include cause of death
ifelse(classification == "Gunshot", "Killed by gunshot",
ifelse(classification == "Death in custody", "Died in custody",
ifelse(classification == "Other", "",
ifelse(classification == "Taser", "Killed by taser",
ifelse(classification == "Struck by vehicle", "Struck by vehicle", ""))))))
)
Missing points in leaflet are usually caused by NA data: leaflet doesn't plot anything after the row with the NA in it.
thecounted[rowSums(is.na(thecounted)) > 0, ]
# uid armed name slug age gender race date state classification large lat long hasimage
# 738 738 Firearm Jason Hale jason-hale-738 29 M W 2015/8/19 WA Gunshot FALSE NA NA FALSE
Remove this row and all the markers display:
leaflet(data = thecounted[rowSums(is.na(thecounted)) == 0, ]) %>%
addTiles() %>%
addCircleMarkers(lat = ~lat, lng = ~long,
stroke = FALSE,
color = ~pal(armed),
fillOpacity = ~ifelse(armed == "No", 0.75, 0.35), # formula, so it is evaluated on the filtered data
popup = ~name)

How to extract data from javascript array using beautiful soup?

The javascript file looks like this:
states_arr['Chittoor']= new Array( "Kurnool (Abbas Nagar)# 9247001529 # H. No. 80-11/111, ; Beside ICICI Bank ATM, ; Near Krishna Nagar Railway Gate, ; Abbas Nagar, Kurnool.","Kurnool # 9247001530 # H. No. 46/694, Near Annapurna Hotel, Opp. Govt Hospital, Budawarpet, Kurnool. " );
I want to extract the address from all the arrays in the js file that starts after the second '#' sign i.e. " H. No. 80-11/111, ; Beside ICICI Bank ATM, ; Near Krishna Nagar Railway Gate, ; Abbas Nagar, Kurnool.",
"H. No. 46/694, Near Annapurna Hotel, Opp. Govt Hospital, Budawarpet, Kurnool. "
The above complete javascript file is available at :
http://www.heteropharmacy.com/jScript/myScript.js
I am using BeautifulSoup for that matter and here is my incorrect code:
soup = BeautifulSoup(html_doc)
script = soup.find_all("script")
pattern = re.compile(r" (?<=[0-9]\s#\s).+")
while pattern.search(script):
line1 = pattern.search(script)
print line1
This file then needs to be converted into json format.
You could clean up the file with just Python.
Assume actual_file holds the contents of your js file, which you have just read in:
# Split it by newline character and remove all lines which have less than 2
# characters since our addresses are much longer
lines_of_js = [i.strip() for i in actual_file.split("\n") if len(i)>2]
# Now, remove lines with syntaxes of javascript and keep lines which have
# `#` in address. You may want to revisit this part for further fine tuning.
lines_with_address = [line for line in lines_of_js if
all([i not in line for i in '(<>={}'])
and
('#' in line)
]
lines_with_address is now a list of such addresses.
Split each line in this list by # and take the last item; this should be your address:
In [94]: [line.split('#')[-1] for line in lines_with_address]
Out[94]:
[' D.No. 5-9-24/66/1/a, Hill fort, ; Beside MLA quarters, ; Adarsh nagar, Hyderabad",',
' Plot No:23/A, Addagutta, ; Co-Opp.Housing Society Ltd. Opp. JNTU, ; HMT Hills Road, Kukatpally, Hyd",',
' Shop No.5, Plot No.86, ; Road No. 6, Vishnavy Recidency, ; Near AXIS Bank ATM, R.K.Puram,; Alkapuri Colony, Hyderabad.",',
Tested in Python2.7
You don't need bs4 for this. Just use urllib2 (or the Python 3 equivalent) to read the source.
import re
import urllib2
dta = urllib2.urlopen('http://www.heteropharmacy.com/jScript/myScript.js').read()
final = [i[2:].replace('",', '').strip() for i in re.findall(r'# (?:[a-zA-Z]).+', dta)]
Sample Output (List):
'H.No : 3-5-60/C-12 ; Opp. Andhra bank, ; Vivekananda Nagar Colony, ; Kukatpally, Hyderabad',
"Flat No.G-6, Bhavya's Srinivasam, ; Opp. Sanghamitra School, ; Nizampet Road, Hyderabad",
'H.No. 8-2-603/B/28, ; Opp. Hyderabad kababs, ; Road No-10, Banjara hills, Hyderabad',
'H. No.1-55/C/9&10, Shop No. 9 & 10, ; Raghava Towers, Main Road, ; Madinaguda',
'beside Cyberabad Police Commisionarate Office, ; Telecom Nagar, Gachibowli, ; Hyderabad"'
