Categories
Reading Notes

A First Try at Python

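Out of sheer boredom today I decided to give Python a try. I found a puzzle that goes roughly like this:

  • Given an input string, find the most beautiful substring with no repeated characters.
  • Substring: anything obtained by deleting some characters from the original string (so strictly speaking a subsequence; the characters need not be adjacent).
  • String with no repeats: no character appears more than once.
  • A is more beautiful than B: A is longer than B, or A comes after B in lexicographic order.

Since every candidate is free of repeats, the longest possible one uses each distinct character exactly once, so the length comparison never decides anything. The task reduces to finding the lexicographically largest string among the repeat-free substrings of that same maximal length.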
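To pin the definition down, here is a minimal brute-force sketch in Python. It is only meant as a reference for tiny inputs, since it enumerates every choice of positions, and the function name most_beautiful_bruteforce is made up for illustration rather than taken from the puzzle.

```python
from itertools import combinations


def most_beautiful_bruteforce(s: str) -> str:
    """Try every subsequence that uses each distinct character exactly once
    and return the lexicographically largest one."""
    k = len(set(s))  # the longest repeat-free subsequence has this length
    best = ""
    for positions in combinations(range(len(s)), k):
        candidate = "".join(s[i] for i in positions)
        # keep only candidates without repeated characters
        if len(set(candidate)) == k and candidate > best:
            best = candidate
    return best


print(most_beautiful_bruteforce("bcabc"))     # cab
print(most_beautiful_bruteforce("cbacdcbc"))  # cbad
```

The number of position choices grows combinatorially, so this is useful only for checking a faster method on short strings.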

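I wrote this in R first, just to get a working algorithm. The basic idea is a brute-force, level-by-level recursion: in each round, pick the largest character that can still be kept without losing any other character.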

x = 'nlhthgrfdnnlprjtecpdrthigjoqdejsfkasoctjijaoebqlrgaiakfsbljmpibkidjsrtkgrdnqsknbarpabgokbsrfhmeklrle'

x_split = strsplit(x, split = "")[[1]]
unique_x = unique(x_split)
x_remain = character()

# find the largest character that can be kept in each round

# initialize
current_string = x_split
current_order = sort(unique_x, decreasing = TRUE)
while (length(x_remain) < length(unique_x))  # one pick per distinct character
{
  for (character in current_order)
  {
    index = which(current_string == character)
    # suffix starting at the first occurrence of this character
    sub_string = current_string[min(index):length(current_string)]
    if (length(setdiff(unique(current_string), unique(sub_string))) == 0) # no loss of characters
    {
      x_remain = c(x_remain, character)
      # drop everything up to the first occurrence, plus all later copies
      current_string = current_string[-c(1:min(index), index)]
      current_order = sort(unique(current_string), decreasing = TRUE)
      break
    }
  }
}

paste(x_remain, collapse = "")  # answer is 'tsocrpkijgdqnbafhmle'

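Afterwards I rewrote it in Python. It was basically a matter of swapping in the equivalent functions... Am I wasting what Python has to offer? I still don't get what "programming on the fly" is supposed to feel like...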

x = 'nlhthgrfdnnlprjtecpdrthigjoqdejsfkasoctjijaoebqlrgaiakfsbljmpibkidjsrtkgrdnqsknbarpabgokbsrfhmeklrle'
x_split = list(x)
unique_x = sorted(set(x_split), reverse=True)
x_remain = []
# initialize
current_string = x_split
current_order = unique_x
while len(x_remain) < len(unique_x):
    for character in current_order:
        index = current_string.index(character)  # first occurrence
        sub_string = current_string[index:]
        if not set(current_string) - set(sub_string):  # no loss of characters
            x_remain.append(character)
            # continue with the suffix, minus every copy of the chosen character
            current_string = [c for c in sub_string if c != character]
            current_order = sorted(set(current_string), reverse=True)
            break
print(''.join(x_remain))  # tsocrpkijgdqnbafhmle
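For comparison, here is a hedged sketch of a different technique: a single left-to-right pass with a stack, which pops a smaller character whenever that character is known to appear again later. This is not the approach used in the code above, and the name most_beautiful_stack is hypothetical.

```python
def most_beautiful_stack(s: str) -> str:
    """Lexicographically largest subsequence using each distinct character once."""
    last = {ch: i for i, ch in enumerate(s)}  # last index of every character
    stack = []      # characters kept so far, in order
    used = set()    # characters currently on the stack
    for i, ch in enumerate(s):
        if ch in used:
            continue
        # a smaller character on top can be dropped if it occurs again later
        while stack and stack[-1] < ch and last[stack[-1]] > i:
            used.remove(stack.pop())
        stack.append(ch)
        used.add(ch)
    return "".join(stack)


x = ('nlhthgrfdnnlprjtecpdrthigjoqdejsfkasoctjijaoebqlrgaiakfsbljmpibkid'
     'jsrtkgrdnqsknbarpabgokbsrfhmeklrle')
print(most_beautiful_stack(x))  # tsocrpkijgdqnbafhmle
```

Each character is pushed and popped at most once, so the pass is linear in the length of the string, whereas the loop above rescans the remaining string in every round.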

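When I finally finished the Python version, I found that my connection had dropped... so I couldn't submit online. By the time I was back online the deadline had passed, sigh. I'll count it as some practice on a boring weekend.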

Categories
Reading Notes

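Continuous > Discrete

I'm only trying to recover, so I'm reading some dry material along the way.

--------------------End of rambling---------------------

I really admire people like Andrew Gelman, who has kept a blog going for so many years and touches on a bit of everything; whenever I read it I feel I have learned something (I hope I'm making progress...). I often find questions there that are "not very big" yet quite fundamental.

In a gap while my code was running I skimmed this post, Econometrics, political science, epidemiology, etc.: Don’t model the probability of a discrete outcome, model the underlying continuous variable, and found it quite interesting. The gist: if a continuous variable is available, don't use the discrete variable carved out of it. He gives several examples. I'm not familiar with the baseball ones, but the econ one at the end naturally catches the eye: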

Even in recent years, with all the sophistication in economic statistics, you’ll still see people fitting logistic models for binary outcomes even when the continuous variable is readily available. (See, for example, the second-to-last paragraph here, which is actually an economist doing political science, but I’m pretty sure there are lots of examples of this sort of thing in econ too.)

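Then I went back to the earlier post, Estimating the incumbent-party advantage and the incumbency advantage in House elections. After a quick read I understood that Andrew recommends predicting the number of votes directly instead of predicting win or lose. Otherwise the information lost along the way is a real pity: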

The key is that vote differential is available, and a simply performing a logit model for wins alone is implicitly taking this differential as latent or missing data, thus throwing away information.

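In addition, some people feel that a binary outcome is more robust because it requires no further distributional assumptions. Andrew's reply matches another post of his that I had read before: "Everyone’s trading bias for variance at some point, it’s just done at different places in the analyses". Once you pool scattered information from so many times and places into a single regression, you are already straining the robustness of the estimator. Using the continuous variable instead lets you get a decent estimate while pooling somewhat less of that data.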
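A small simulation can make the information-loss argument concrete. The sketch below is not Gelman's analysis: it assumes vote margins that really are normally distributed (which favours the continuous model), and the values of mu, sigma, n_obs and n_reps are hypothetical. It estimates the probability of winning (margin > 0) in two ways and compares how much the two estimates vary across repeated samples.

```python
import random
from statistics import NormalDist, mean, stdev, pvariance


def simulate(n_obs=50, n_reps=2000, mu=0.5, sigma=1.0, seed=0):
    """Return the sampling variance of two estimators of P(margin > 0)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    binary_estimates = []
    continuous_estimates = []
    for _ in range(n_reps):
        margins = [rng.gauss(mu, sigma) for _ in range(n_obs)]
        # Binary approach: keep only win/lose and take the share of wins.
        wins = [1.0 if m > 0 else 0.0 for m in margins]
        binary_estimates.append(mean(wins))
        # Continuous approach: fit a normal to the margins, then derive P(win).
        continuous_estimates.append(std_normal.cdf(mean(margins) / stdev(margins)))
    return pvariance(binary_estimates), pvariance(continuous_estimates)


var_binary, var_continuous = simulate()
print(f"variance of binary estimator:     {var_binary:.5f}")
print(f"variance of continuous estimator: {var_continuous:.5f}")
```

Under these assumptions the continuous estimator should come out less variable, at the price of relying on the normality assumption; that is the bias-for-variance trade mentioned in the quote.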

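----------------Self-criticism begins--------------

1. The cut() function in R should be used with caution.

2. A moment ago I was still trying to chop a continuous variable into several bins... so I quietly deleted the pile of SQL CASE WHEN clauses I had written, sigh. All that typing for nothing.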

Categories
Reading Notes

Constitutional Law by Yale: Lecture Notes (2)

Just tidying up a few things.

Anti-Federalists and the Federalists

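Basically, the two camps disagreed over how much power the federal government and the state governments should each have. Here is a summary I copied: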

The Anti-Federalists opposed the new U.S. Constitution for numerous reasons.

  • They distrusted large, powerful national governments and believed liberty could only be protected in small republics in which the rulers were closely checked by the public.
  • They believed a large nation could best be governed by a confederation, with local governments having the most control. A strong national government would be distant from the people and not capable of protecting the rights of the citizens. Congress would tax too heavily and the Supreme Court would overrule state courts.
  • They distrusted the president having too much power, including a standing army under his control.
  • They also favored the addition of a Bill of Rights to protect the citizens from the national government. They wanted the House of Representatives increased in size so it would reflect a greater variety of popular interests.
  • They wanted a council created to check the actions of the president.
  • They also favored leaving military affairs in the hands of the state militias.

Federalists favored a strong national government with supreme power over state governments.

  • The rights of citizens would be protected from the government via legislation, the courts, and the Bill of Rights.
  • Federalists distrusted the masses to select the best candidates so they made only the House of Representatives directly elected by the people. Checks and Balances within the Constitution would make sure no one branch became too powerful.
  • The President would have control over the military, necessary for national defense, but could not violate the laws. The Secretary of War would advise the President.
  • The national government needed the power to tax and enforce the laws, or the ills of the Articles would hamper the development, agriculture and industry, of the new nation.

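Put plainly, the Anti-Federalists wanted the state governments to be more independent and the federal government to interfere less with the states.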

Categories
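Economics, IT Observations and Thoughts  Reading Notes

Starting from the Controversy over Taxing Online Transactions

For the past few years there has been a running public debate over whether online transactions (by small and medium sellers) should be taxed. A casual news search shows Taobao is a favorite target:

eBay in the US isn't having an easy time either...

Which means we have to dig into what US tax law says about sales tax.

--------------The next part is rather long-winded; skip it if you don't care about the details-----------

It goes back to 1998, while Clinton was still in office, when the Internet Tax Freedom Act was passed. Copying the gist of the act from the wiki: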

This law bars federal, state and local governments from taxing Internet access and from imposing discriminatory Internet-only taxes such as bit taxes, bandwidth taxes, and email taxes. The law also bars multiple taxes on electronic commerce.

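In short, federal, state, and local governments may not tax Internet access, nor impose taxes on bits, bandwidth, or email. I flipped through the original text of the act, which starts on page 720, and further on it defines multiple taxes: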

IN GENERAL.—The term ‘‘multiple tax’’ means any tax that is imposed by one State or political subdivision thereof on the same or essentially the same electronic commerce that is also subject to another tax imposed by another State or political subdivision thereof (whether or not at the same rate or on the same basis), without a credit (for example, a resale exemption certificate) for taxes paid in other jurisdictions.

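My simple reading (sorry, I didn't study law, so this may well be inaccurate) is that several states may not tax the same e-commerce transaction more than once. In 2007 the act was extended to November 1, 2014 (Internet Tax Freedom Act Amendments Act of 2007). In practice, most follow a 1992 Supreme Court ruling: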

In Quill Corp. v. North Dakota, the Supreme Court ruled that a business must have a physical presence in a state for that state to require it to collect sales taxes.

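-------------End of the long-winded part-------------

In other words, as long as a seller has no physical presence in a state, that state cannot force it to collect sales tax. The interesting development is the Marketplace Fairness Act of 2013, whose main point is that online-only stores would also have to collect sales tax or use tax. The House has not voted on it yet.

[Disclaimer]: Everything below about eBay comes from the Internet and other public sources and has nothing to do with my job; I am only restating it here. All conclusions are the author's responsibility and do not represent the company's views.

So how is sales tax collected on eBay at the moment?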

Normally buyers do NOT pay tax on eBay unless all 3 of the following criteria are met:

  1. The seller is a Business seller.
  2. The seller has a physical presence in buyer’s shipping address state.
  3. That state charges sales tax.

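In other words, a consumer pays tax only when buying from a business seller on eBay that has a physical presence in the buyer's state, and only if that state levies sales tax. The typical cases are retailers such as Macy's or Best Buy that run stores on eBay. So when you buy something on eBay you usually don't see a sales tax line at checkout (in the US the tax is added on top of the listed price, and any sales tax is itemized on the bill). Seen this way, online sellers have a tax-free advantage over offline sellers (the tax is levied directly on consumers, but who actually bears the burden depends on the elasticities of supply and demand). Put bluntly: if an item costs $100 online with free shipping and the shop next to my home also sells it for $100, but buying it in the shop means paying 9% tax (taking California as an example), then unless I need it urgently, why wouldn't I buy it online?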
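The arithmetic behind that comparison, plus the textbook approximation for who bears a small tax in a competitive market (the buyers' share is e_s / (e_s + |e_d|)), can be sketched as follows. The function names and the elasticity values are hypothetical and chosen only for illustration; the $100 price and 9% rate are the ones from the example above.

```python
def in_store_total(price: float, sales_tax_rate: float) -> float:
    """Price at the register when sales tax is added on top of the listed price."""
    return price * (1 + sales_tax_rate)


def consumer_tax_share(supply_elasticity: float, demand_elasticity: float) -> float:
    """Approximate share of a small tax borne by buyers: e_s / (e_s + |e_d|)."""
    return supply_elasticity / (supply_elasticity + abs(demand_elasticity))


online = 100.00
store = in_store_total(100.00, 0.09)
print(f"online: ${online:.2f}, in store: ${store:.2f}, gap: ${store - online:.2f}")

# Hypothetical elasticities: supply twice as elastic as demand.
share = consumer_tax_share(supply_elasticity=2.0, demand_elasticity=-1.0)
print(f"buyers bear about {share:.0%} of the tax")
```

The less elastic side of the market carries more of the tax, which is why the statutory "the buyer pays" says little about the real burden.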

With the background finally laid out, let's look at a paper from the January 2014 issue of the AER:

Einav, Liran, et al. "Sales Taxes and Internet Commerce." American Economic Review 104.1 (2014): 1-26.
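The paper mainly examines how brick-and-mortar stores and online stores are affected when a state raises its sales tax rate. They use only eBay data, and the conclusion is:
every one percentage point increase in a state's sales tax increases online purchases by state residents by almost 2%, while decreasing their online purchases from state retailers by 3.4%.
That is, each one-percentage-point rise in the sales tax increases online purchases by that state's residents by almost 2% and reduces their online purchases from in-state retailers by 3.4% (because those purchases are taxed). Let's see how this conclusion is reached step by step.
First, a look at sales tax rates across US states:
[Figure: screenshot of SalesTaxes(1).pdf, sales tax rates by state]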

Categories
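Reading Notes

Constitutional Law by Yale: Lecture Notes (1)

A Coursera course I had long been looking forward to has finally started this spring. I have always felt that my grasp of law is too weak, or rather that I never received any professional legal training (I really ought to catch up on accounting too, but I'm too lazy to sit the CFA)... so I need to cram. This course therefore fits my needs quite well. Last year I had planned to finish the world history course, A History of the World since 1300, but my attention kept drifting in the later weeks... I did finish National Taiwan University's "Qin Shi Huang", though. This time I won't be greedy; I'll try to do this one course properly.

I write notes purely to force myself to follow the lectures... Here are a few simple goals for myself:

  • Learn the basics of the US Constitution and the social institutions that go with it
  • Learn the historical background that gave rise to such a constitution
  • Learn how it was amended afterwards, step by step
  • Gradually think through how these institutional changes went hand in hand with modern US economic development

Put plainly, I want to understand, from the angle of institutional economics and history, how the US Constitution has shaped the economic and social ecosystem. After all, mechanism design has always been one of my favorite research areas, and learning about social institutions is slow, careful work. I hope the next few months are enough to reach this goal.

-----------------------------

The first lecture is mostly a basic introduction to the course. Copying the outline of the first few weeks: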

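  • Congressional Powers: Congress
  • Presidential Powers: the President
  • Judges and Juries: the court system
  • States and Territories: the federal union
  • The Law of the Land: the Constitution as the supreme law of the land
  • Making Amends: amendments
  • Progressive Reforms and Modern Moves: reforms and recent developments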

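Then some key points, copied down. They are purely transcribed and do not represent my own leanings.

Democracy: The US Constitution was drawn up in 1787. Before then, by the lecture's account, democracy was practically absent from the world, whereas today, a little over two hundred years later, even India with its population of more than a billion holds democratic elections.

The Preamble to the US Constitution: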

We the People of the United States, in Order to form a more perfect Union, establish Justice, insure domestic Tranquility, provide for the common defense, promote the general Welfare, and secure the Blessings of Liberty to ourselves and our Posterity, do ordain and establish this Constitution for the United States of America.

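The federal government model:

In one sentence: the United States separates power into three branches, with legislative power vested in Congress, executive power in the President, and judicial power in the federal courts.

The bicameral model and legislative power: Britain has long had an upper and a lower house. The design originally grew out of fairness considerations (in Parliament's early days it took the form of a House of Lords made up of nobles and a House of Commons made up of commoners. Today the Commons holds the clear advantage in practice: it has priority on budget bills, the Lords can only delay them for one month, public bills that pass third reading in the Commons cannot ultimately be blocked by the Lords, so the Lords' power is largely symbolic. The cabinet led by the Prime Minister answers only to the Commons). Some people therefore argue that such a system makes new legislation relatively hard to pass (for example, bad old provisions may linger for a long time) and so tends toward "fewer laws". (This is disputed.)

The bicameral system of the US federal government evolved from it, with a Senate and a House. In the Senate, every state sends two members regardless of population; in the House of Representatives, seats are apportioned by population. This reflects a balance between "legislators distrusting the federal government" and "the government regarding legislators as unrepresentative", or, put another way, a balance struck in the contest among the states over how much say each one gets. Today the House has 435 members from the states, while the Senate has 100 members from the 50 states. The numbers themselves were debated at the time (with too many members, discussion becomes nearly impossible; travel was also a concern). Travel costs are covered entirely by the federal government rather than by the individual states (to avoid a situation where only the states with a stake in the matter send their representatives).

On elections: the House is elected every two years, and anyone who is at least 25 and has been a US citizen for at least 7 years can be elected to it (senators must be at least 30 and citizens for 9 years). Senators serve 6-year terms that are staggered, so about 1/3 of the seats come up for election every two years.

If either the Senate or the House considers a proposed bill unconstitutional (unconstitutional law), it never comes to a vote. The President, too, has the power to judge a bill unconstitutional.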